Title: Case Study 1: Linear Regression
Authors: Will Butler, Robert (Reuven) Derner
Date: 8/24/23
A group of scientists studying superconductors has brought us a problem. Superconductors are materials that offer little or no resistance to electrical current.
The scientists want us to use the data they have collected so far to build a model that predicts new superconductors from their properties. The data include each material's composition and the temperature at which it becomes superconducting. We begin by examining the data set through exploratory data analysis.
The desired model should predict the temperature at which a candidate material becomes superconducting, based on the experimental inputs the scientists have provided. It needs to be interpretable, so that the scientists can see not only whether a material would superconduct but also at what temperature. We will therefore use a regression model, which lets the scientists interpret the result through the relative importance of each feature in the model.
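As a rough illustration of what "relative importance of each feature" can mean in a regression setting, the sketch below fits a Lasso model on standardized features and ranks the features by absolute coefficient size. The data here are synthetic stand-ins (the names x1, x2, x3 and the alpha value are assumptions for illustration only), not the superconductor data or the final model.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the superconductor data): the response depends
# strongly on x1, weakly on x2, and not at all on x3.
rng = np.random.default_rng(110)
X = pd.DataFrame(rng.normal(size=(500, 3)), columns=["x1", "x2", "x3"])
y = 5.0 * X["x1"] + 0.5 * X["x2"] + rng.normal(scale=0.1, size=500)

# Standardize first so that coefficient magnitudes are comparable across features.
model = make_pipeline(StandardScaler(), Lasso(alpha=0.1))
model.fit(X, y)

# Rank features by the absolute size of their fitted coefficients.
coefs = pd.Series(model.named_steps["lasso"].coef_, index=X.columns)
importance = coefs.abs().sort_values(ascending=False)
print(importance)
```

Because the features share a common scale after standardization, a larger absolute coefficient indicates a larger effect on the prediction per standard deviation of that feature, which is the kind of reading the scientists are asking for.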
Data Source:
Provided by the client, along with a metadata dictionary that defines the terms.
# Import libraries
import pandas as pd
import seaborn as sns
import numpy as np
from numpy import mean
import matplotlib.pyplot as plt
import random
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio
from tabulate import tabulate
from sklearn import preprocessing
from sklearn.linear_model import LogisticRegression
from sklearn.linear_model import Lasso
from sklearn.linear_model import Ridge
from sklearn import metrics
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import MinMaxScaler
from sklearn.datasets import make_classification
from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.metrics import classification_report, confusion_matrix,mean_squared_error
# Workbook settings
pd.set_option('display.max_columns', None)
random.seed(110)
pio.renderers.default='notebook'
# Import data from github (next phase)
url = 'https://raw.githubusercontent.com/ReuvenDerner/MSDS_QuantifyingTheWorld/main/train.csv'
data = pd.read_csv(url, encoding = "utf-8")
# Import data from github (next phase)
url2 = 'https://raw.githubusercontent.com/ReuvenDerner/MSDS_QuantifyingTheWorld/main/unique_m.csv'
unique_m = pd.read_csv(url2, encoding = "utf-8")
# Local import (to be removed later)
# data = pd.read_csv("C:/Users/robert.derner/OneDrive - Flagship Credit Acceptance/Documents/School/Quantifying The World/Case Study One/train.csv")
Describe the meaning and type of data (scale, values, etc.) for each attribute in the data file.
| Response Features | Definition |
|---|---|
| critical_temp | The temperature at which the material becomes superconducting |
| Categorical Features | Definition |
|---|---|
| number_of_elements | The number of distinct chemical elements contained in the superconductor |
| Continuous Features | Definition |
|---|---|
| mean_atomic_mass | Mean of atomic mass |
| wtd_mean_atomic_mass | Weighted mean of atomic mass |
| gmean_atomic_mass | Geometric mean of atomic mass |
| wtd_gmean_atomic_mass | Weighted geometric mean of atomic mass |
| entropy_atomic_mass | Entropy (degree of disorder or uncertainty) of atomic mass |
| wtd_entropy_atomic_mass | Weighted entropy of atomic mass |
| range_atomic_mass | Range of atomic mass |
| wtd_range_atomic_mass | Weighted range of atomic mass |
| std_atomic_mass | Standard deviation of atomic mass |
| wtd_std_atomic_mass | Weighted standard deviation of atomic mass |
| mean_fie | Mean of first ionization energy (FIE) |
| wtd_mean_fie | Weighted mean of FIE |
| gmean_fie | Geometric mean of FIE |
| wtd_gmean_fie | Weighted geometric mean of FIE |
| entropy_fie | Entropy (degree of disorder or uncertainty) of FIE |
| wtd_entropy_fie | Weighted entropy of FIE |
| range_fie | Range of FIE |
| wtd_range_fie | Weighted range of FIE |
| std_fie | Standard deviation of FIE |
| wtd_std_fie | Weighted standard deviation of FIE |
| mean_atomic_radius | Mean of atomic radius |
| wtd_mean_atomic_radius | Weighted mean of atomic radius |
| gmean_atomic_radius | Geometric mean of atomic radius |
| wtd_gmean_atomic_radius | Weighted geometric mean of atomic radius |
| entropy_atomic_radius | Entropy (degree of disorder or uncertainty) of atomic radius |
| wtd_entropy_atomic_radius | Weighted entropy of atomic radius |
| range_atomic_radius | Range of atomic radius |
| wtd_range_atomic_radius | Weighted range of atomic radius |
| std_atomic_radius | Standard deviation of atomic radius |
| wtd_std_atomic_radius | Weighted standard deviation of atomic radius |
| mean_Density | Mean of density |
| wtd_mean_Density | Weighted mean of density |
| gmean_Density | Geometric mean of density |
| wtd_gmean_Density | Weighted geometric mean of density |
| entropy_Density | Entropy (degree of disorder or uncertainty) of density |
| wtd_entropy_Density | Weighted entropy of density |
| range_Density | Range of density |
| wtd_range_Density | Weighted range of density |
| std_Density | Standard deviation of density |
| wtd_std_Density | Weighted standard deviation of density |
| mean_ElectronAffinity | Mean of electron affinity |
| wtd_mean_ElectronAffinity | Weighted mean of electron affinity |
| gmean_ElectronAffinity | Geometric mean of electron affinity |
| wtd_gmean_ElectronAffinity | Weighted geometric mean of electron affinity |
| entropy_ElectronAffinity | Entropy (degree of disorder or uncertainty) of electron affinity |
| wtd_entropy_ElectronAffinity | Weighted entropy of electron affinity |
| range_ElectronAffinity | Range of electron affinity |
| wtd_range_ElectronAffinity | Weighted range of electron affinity |
| std_ElectronAffinity | Standard deviation of electron affinity |
| wtd_std_ElectronAffinity | Weighted standard deviation of electron affinity |
| mean_FusionHeat | Mean of fusion heat |
| wtd_mean_FusionHeat | Weighted mean of fusion heat |
| gmean_FusionHeat | Geometric mean of fusion heat |
| wtd_gmean_FusionHeat | Weighted geometric mean of fusion heat |
| entropy_FusionHeat | Entropy (degree of disorder or uncertainty) of fusion heat |
| wtd_entropy_FusionHeat | Weighted entropy of fusion heat |
| range_FusionHeat | Range of fusion heat |
| wtd_range_FusionHeat | Weighted range of fusion heat |
| std_FusionHeat | Standard deviation of fusion heat |
| wtd_std_FusionHeat | Weighted standard deviation of fusion heat |
| mean_ThermalConductivity | Mean of thermal conductivity |
| wtd_mean_ThermalConductivity | Weighted mean of thermal conductivity |
| gmean_ThermalConductivity | Geometric mean of thermal conductivity |
| wtd_gmean_ThermalConductivity | Weighted geometric mean of thermal conductivity |
| entropy_ThermalConductivity | Entropy (degree of disorder or uncertainty) of thermal conductivity |
| wtd_entropy_ThermalConductivity | Weighted entropy of thermal conductivity |
| range_ThermalConductivity | Range of thermal conductivity |
| wtd_range_ThermalConductivity | Weighted range of thermal conductivity |
| std_ThermalConductivity | Standard deviation of thermal conductivity |
| wtd_std_ThermalConductivity | Weighted standard deviation of thermal conductivity |
| mean_Valence | Mean of valence |
| wtd_mean_Valence | Weighted mean of valence |
| gmean_Valence | Geometric mean of valence |
| wtd_gmean_Valence | Weighted geometric mean of valence |
| entropy_Valence | Entropy (degree of disorder or uncertainty) of valence |
| wtd_entropy_Valence | Weighted entropy of valence |
| range_Valence | Range of valence |
| wtd_range_Valence | Weighted range of valence |
| std_Valence | Standard deviation of valence |
| wtd_std_Valence | Weighted standard deviation of valence |
data.head()
| | number_of_elements | mean_atomic_mass | wtd_mean_atomic_mass | gmean_atomic_mass | wtd_gmean_atomic_mass | entropy_atomic_mass | wtd_entropy_atomic_mass | range_atomic_mass | wtd_range_atomic_mass | std_atomic_mass | wtd_std_atomic_mass | mean_fie | wtd_mean_fie | gmean_fie | wtd_gmean_fie | entropy_fie | wtd_entropy_fie | range_fie | wtd_range_fie | std_fie | wtd_std_fie | mean_atomic_radius | wtd_mean_atomic_radius | gmean_atomic_radius | wtd_gmean_atomic_radius | entropy_atomic_radius | wtd_entropy_atomic_radius | range_atomic_radius | wtd_range_atomic_radius | std_atomic_radius | wtd_std_atomic_radius | mean_Density | wtd_mean_Density | gmean_Density | wtd_gmean_Density | entropy_Density | wtd_entropy_Density | range_Density | wtd_range_Density | std_Density | wtd_std_Density | mean_ElectronAffinity | wtd_mean_ElectronAffinity | gmean_ElectronAffinity | wtd_gmean_ElectronAffinity | entropy_ElectronAffinity | wtd_entropy_ElectronAffinity | range_ElectronAffinity | wtd_range_ElectronAffinity | std_ElectronAffinity | wtd_std_ElectronAffinity | mean_FusionHeat | wtd_mean_FusionHeat | gmean_FusionHeat | wtd_gmean_FusionHeat | entropy_FusionHeat | wtd_entropy_FusionHeat | range_FusionHeat | wtd_range_FusionHeat | std_FusionHeat | wtd_std_FusionHeat | mean_ThermalConductivity | wtd_mean_ThermalConductivity | gmean_ThermalConductivity | wtd_gmean_ThermalConductivity | entropy_ThermalConductivity | wtd_entropy_ThermalConductivity | range_ThermalConductivity | wtd_range_ThermalConductivity | std_ThermalConductivity | wtd_std_ThermalConductivity | mean_Valence | wtd_mean_Valence | gmean_Valence | wtd_gmean_Valence | entropy_Valence | wtd_entropy_Valence | range_Valence | wtd_range_Valence | std_Valence | wtd_std_Valence | critical_temp |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4 | 88.944468 | 57.862692 | 66.361592 | 36.116612 | 1.181795 | 1.062396 | 122.90607 | 31.794921 | 51.968828 | 53.622535 | 775.425 | 1010.268571 | 718.152900 | 938.016780 | 1.305967 | 0.791488 | 810.6 | 735.985714 | 323.811808 | 355.562967 | 160.25 | 105.514286 | 136.126003 | 84.528423 | 1.259244 | 1.207040 | 205 | 42.914286 | 75.237540 | 69.235569 | 4654.35725 | 2961.502286 | 724.953211 | 53.543811 | 1.033129 | 0.814598 | 8958.571 | 1579.583429 | 3306.162897 | 3572.596624 | 81.8375 | 111.727143 | 60.123179 | 99.414682 | 1.159687 | 0.787382 | 127.05 | 80.987143 | 51.433712 | 42.558396 | 6.9055 | 3.846857 | 3.479475 | 1.040986 | 1.088575 | 0.994998 | 12.878 | 1.744571 | 4.599064 | 4.666920 | 107.756645 | 61.015189 | 7.062488 | 0.621979 | 0.308148 | 0.262848 | 399.97342 | 57.127669 | 168.854244 | 138.517163 | 2.25 | 2.257143 | 2.213364 | 2.219783 | 1.368922 | 1.066221 | 1 | 1.085714 | 0.433013 | 0.437059 | 29.0 |
| 1 | 5 | 92.729214 | 58.518416 | 73.132787 | 36.396602 | 1.449309 | 1.057755 | 122.90607 | 36.161939 | 47.094633 | 53.979870 | 766.440 | 1010.612857 | 720.605511 | 938.745413 | 1.544145 | 0.807078 | 810.6 | 743.164286 | 290.183029 | 354.963511 | 161.20 | 104.971429 | 141.465215 | 84.370167 | 1.508328 | 1.204115 | 205 | 50.571429 | 67.321319 | 68.008817 | 5821.48580 | 3021.016571 | 1237.095080 | 54.095718 | 1.314442 | 0.914802 | 10488.571 | 1667.383429 | 3767.403176 | 3632.649185 | 90.8900 | 112.316429 | 69.833315 | 101.166398 | 1.427997 | 0.838666 | 127.05 | 81.207857 | 49.438167 | 41.667621 | 7.7844 | 3.796857 | 4.403790 | 1.035251 | 1.374977 | 1.073094 | 12.878 | 1.595714 | 4.473363 | 4.603000 | 172.205316 | 61.372331 | 16.064228 | 0.619735 | 0.847404 | 0.567706 | 429.97342 | 51.413383 | 198.554600 | 139.630922 | 2.00 | 2.257143 | 1.888175 | 2.210679 | 1.557113 | 1.047221 | 2 | 1.128571 | 0.632456 | 0.468606 | 26.0 |
| 2 | 4 | 88.944468 | 57.885242 | 66.361592 | 36.122509 | 1.181795 | 0.975980 | 122.90607 | 35.741099 | 51.968828 | 53.656268 | 775.425 | 1010.820000 | 718.152900 | 939.009036 | 1.305967 | 0.773620 | 810.6 | 743.164286 | 323.811808 | 354.804183 | 160.25 | 104.685714 | 136.126003 | 84.214573 | 1.259244 | 1.132547 | 205 | 49.314286 | 75.237540 | 67.797712 | 4654.35725 | 2999.159429 | 724.953211 | 53.974022 | 1.033129 | 0.760305 | 8958.571 | 1667.383429 | 3306.162897 | 3592.019281 | 81.8375 | 112.213571 | 60.123179 | 101.082152 | 1.159687 | 0.786007 | 127.05 | 81.207857 | 51.433712 | 41.639878 | 6.9055 | 3.822571 | 3.479475 | 1.037439 | 1.088575 | 0.927479 | 12.878 | 1.757143 | 4.599064 | 4.649635 | 107.756645 | 60.943760 | 7.062488 | 0.619095 | 0.308148 | 0.250477 | 399.97342 | 57.127669 | 168.854244 | 138.540613 | 2.25 | 2.271429 | 2.213364 | 2.232679 | 1.368922 | 1.029175 | 1 | 1.114286 | 0.433013 | 0.444697 | 19.0 |
| 3 | 4 | 88.944468 | 57.873967 | 66.361592 | 36.119560 | 1.181795 | 1.022291 | 122.90607 | 33.768010 | 51.968828 | 53.639405 | 775.425 | 1010.544286 | 718.152900 | 938.512777 | 1.305967 | 0.783207 | 810.6 | 739.575000 | 323.811808 | 355.183884 | 160.25 | 105.100000 | 136.126003 | 84.371352 | 1.259244 | 1.173033 | 205 | 46.114286 | 75.237540 | 68.521665 | 4654.35725 | 2980.330857 | 724.953211 | 53.758486 | 1.033129 | 0.788889 | 8958.571 | 1623.483429 | 3306.162897 | 3582.370597 | 81.8375 | 111.970357 | 60.123179 | 100.244950 | 1.159687 | 0.786900 | 127.05 | 81.097500 | 51.433712 | 42.102344 | 6.9055 | 3.834714 | 3.479475 | 1.039211 | 1.088575 | 0.964031 | 12.878 | 1.744571 | 4.599064 | 4.658301 | 107.756645 | 60.979474 | 7.062488 | 0.620535 | 0.308148 | 0.257045 | 399.97342 | 57.127669 | 168.854244 | 138.528893 | 2.25 | 2.264286 | 2.213364 | 2.226222 | 1.368922 | 1.048834 | 1 | 1.100000 | 0.433013 | 0.440952 | 22.0 |
| 4 | 4 | 88.944468 | 57.840143 | 66.361592 | 36.110716 | 1.181795 | 1.129224 | 122.90607 | 27.848743 | 51.968828 | 53.588771 | 775.425 | 1009.717143 | 718.152900 | 937.025573 | 1.305967 | 0.805230 | 810.6 | 728.807143 | 323.811808 | 356.319281 | 160.25 | 106.342857 | 136.126003 | 84.843442 | 1.259244 | 1.261194 | 205 | 36.514286 | 75.237540 | 70.634448 | 4654.35725 | 2923.845143 | 724.953211 | 53.117029 | 1.033129 | 0.859811 | 8958.571 | 1491.783429 | 3306.162897 | 3552.668664 | 81.8375 | 111.240714 | 60.123179 | 97.774719 | 1.159687 | 0.787396 | 127.05 | 80.766429 | 51.433712 | 43.452059 | 6.9055 | 3.871143 | 3.479475 | 1.044545 | 1.088575 | 1.044970 | 12.878 | 1.744571 | 4.599064 | 4.684014 | 107.756645 | 61.086617 | 7.062488 | 0.624878 | 0.308148 | 0.272820 | 399.97342 | 57.127669 | 168.854244 | 138.493671 | 2.25 | 2.242857 | 2.213364 | 2.206963 | 1.368922 | 1.096052 | 1 | 1.057143 | 0.433013 | 0.428809 | 23.0 |
unique_m.head()
| | H | He | Li | Be | B | C | N | O | F | Ne | Na | Mg | Al | Si | P | S | Cl | Ar | K | Ca | Sc | Ti | V | Cr | Mn | Fe | Co | Ni | Cu | Zn | Ga | Ge | As | Se | Br | Kr | Rb | Sr | Y | Zr | Nb | Mo | Tc | Ru | Rh | Pd | Ag | Cd | In | Sn | Sb | Te | I | Xe | Cs | Ba | La | Ce | Pr | Nd | Pm | Sm | Eu | Gd | Tb | Dy | Ho | Er | Tm | Yb | Lu | Hf | Ta | W | Re | Os | Ir | Pt | Au | Hg | Tl | Pb | Bi | Po | At | Rn | critical_temp | material |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.20 | 1.80 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 29.0 | Ba0.2La1.8Cu1O4 |
| 1 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.10 | 1.90 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 26.0 | Ba0.1La1.9Ag0.1Cu0.9O4 |
| 2 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.10 | 1.90 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 19.0 | Ba0.1La1.9Cu1O4 |
| 3 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.15 | 1.85 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 22.0 | Ba0.15La1.85Cu1O4 |
| 4 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.30 | 1.70 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 23.0 | Ba0.3La1.7Cu1O4 |
unique_m.shape
(21263, 88)
data.shape
(21263, 82)
Both files from the study contain the same number of records (21,263). unique_m.csv records how much of each chemical element each material contains, while train.csv holds summary statistics of the elemental properties of each superconductor.
Missing Values
Neither data set contains missing values, so there is nothing for us to impute or reshape.
For the purposes of regression on the Critical Temp, we may proceed with the analysis.
# Features with Null Values and Percent missing
null_df = pd.DataFrame(data[data.columns[data.isnull().any()]].isnull().sum()).reset_index()
null_df.columns = ['Feature', 'Value']
null_df['Percent'] = round((null_df['Value'] / data.shape[0] * 100),2)
null_df
| Feature | Value | Percent |
|---|---|---|
# Features with Null Values and Percent missing
null_df = pd.DataFrame(unique_m[unique_m.columns[unique_m.isnull().any()]].isnull().sum()).reset_index()
null_df.columns = ['Feature', 'Value']
null_df['Percent'] = round((null_df['Value'] / unique_m.shape[0] * 100),2)
null_df
| Feature | Value | Percent |
|---|---|---|
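The same check is run twice above, once per data frame. One way to avoid the repetition is a small helper, sketched here; the function name null_summary and the toy frame are hypothetical and exist only for illustration.

```python
import pandas as pd

def null_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column that has missing values, with count and percent."""
    counts = df.isnull().sum()
    counts = counts[counts > 0]  # keep only the columns that actually have nulls
    return pd.DataFrame({
        "Feature": counts.index,
        "Value": counts.values,
        "Percent": (counts.values / len(df) * 100).round(2),
    })

# Toy frame for illustration: column "a" has one missing value out of four rows.
toy = pd.DataFrame({"a": [1.0, None, 3.0, 4.0], "b": [1, 2, 3, 4]})
summary = null_summary(toy)
print(summary)
```

Calling null_summary(data) and null_summary(unique_m) would be expected to return empty frames, matching the empty tables shown above.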
Duplicate Values
There are 66 duplicate rows in the feature data set (train.csv) and none in unique_m.csv. No action was taken.
# Duplicate record validation
data.duplicated().sum()
66
# Duplicate record validation
unique_m.duplicated().sum()
0
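Before deciding how to treat duplicates, it can help to look at the duplicated rows themselves and not only the count. The sketch below uses a toy frame (hypothetical values) to show the pandas calls involved. Dropping the duplicates is shown only as an option; the analysis above keeps them.

```python
import pandas as pd

# Toy frame for illustration: the first two rows are exact duplicates.
toy = pd.DataFrame({
    "feature": [1.0, 1.0, 2.0, 3.0],
    "critical_temp": [29.0, 29.0, 26.0, 19.0],
})

# Count rows that repeat an earlier row (the same call used above).
n_dupes = int(toy.duplicated().sum())

# keep=False marks every copy, so the duplicated groups can be inspected side by side.
all_copies = toy[toy.duplicated(keep=False)]

# Optional: keep the first copy of each duplicated row.
deduped = toy.drop_duplicates().reset_index(drop=True)

print(n_dupes, len(all_copies), len(deduped))
```

One caution if this were applied to the real data: the two files are later joined by row position, so rows would need to be dropped after that join, or dropped from both frames with matching index labels and without resetting the index first.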
### Examine uniqueness of response variable
is_unique = data['critical_temp'].nunique() == data.shape[0]
print(is_unique)
False
# Join the two data sets row by row on their shared index (assumes both files list the materials in the same order)
merged_df = pd.merge(data, unique_m, left_index=True, right_index=True, how="left")
merged_df.head()
| | number_of_elements | mean_atomic_mass | wtd_mean_atomic_mass | gmean_atomic_mass | wtd_gmean_atomic_mass | entropy_atomic_mass | wtd_entropy_atomic_mass | range_atomic_mass | wtd_range_atomic_mass | std_atomic_mass | wtd_std_atomic_mass | mean_fie | wtd_mean_fie | gmean_fie | wtd_gmean_fie | entropy_fie | wtd_entropy_fie | range_fie | wtd_range_fie | std_fie | wtd_std_fie | mean_atomic_radius | wtd_mean_atomic_radius | gmean_atomic_radius | wtd_gmean_atomic_radius | entropy_atomic_radius | wtd_entropy_atomic_radius | range_atomic_radius | wtd_range_atomic_radius | std_atomic_radius | wtd_std_atomic_radius | mean_Density | wtd_mean_Density | gmean_Density | wtd_gmean_Density | entropy_Density | wtd_entropy_Density | range_Density | wtd_range_Density | std_Density | wtd_std_Density | mean_ElectronAffinity | wtd_mean_ElectronAffinity | gmean_ElectronAffinity | wtd_gmean_ElectronAffinity | entropy_ElectronAffinity | wtd_entropy_ElectronAffinity | range_ElectronAffinity | wtd_range_ElectronAffinity | std_ElectronAffinity | wtd_std_ElectronAffinity | mean_FusionHeat | wtd_mean_FusionHeat | gmean_FusionHeat | wtd_gmean_FusionHeat | entropy_FusionHeat | wtd_entropy_FusionHeat | range_FusionHeat | wtd_range_FusionHeat | std_FusionHeat | wtd_std_FusionHeat | mean_ThermalConductivity | wtd_mean_ThermalConductivity | gmean_ThermalConductivity | wtd_gmean_ThermalConductivity | entropy_ThermalConductivity | wtd_entropy_ThermalConductivity | range_ThermalConductivity | wtd_range_ThermalConductivity | std_ThermalConductivity | wtd_std_ThermalConductivity | mean_Valence | wtd_mean_Valence | gmean_Valence | wtd_gmean_Valence | entropy_Valence | wtd_entropy_Valence | range_Valence | wtd_range_Valence | std_Valence | wtd_std_Valence | critical_temp_x | H | He | Li | Be | B | C | N | O | F | Ne | Na | Mg | Al | Si | P | S | Cl | Ar | K | Ca | Sc | Ti | V | Cr | Mn | Fe | Co | Ni | Cu | Zn | Ga | Ge | As | Se | Br | Kr | Rb | Sr | Y | Zr | Nb | Mo | Tc | Ru | Rh | Pd | Ag | Cd | In | Sn | Sb | Te | I | Xe | Cs | Ba | La | Ce | Pr | Nd | Pm | Sm | Eu | Gd | Tb | Dy | Ho | Er | Tm | Yb | Lu | Hf | Ta | W | Re | Os | Ir | Pt | Au | Hg | Tl | Pb | Bi | Po | At | Rn | critical_temp_y | material |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4 | 88.944468 | 57.862692 | 66.361592 | 36.116612 | 1.181795 | 1.062396 | 122.90607 | 31.794921 | 51.968828 | 53.622535 | 775.425 | 1010.268571 | 718.152900 | 938.016780 | 1.305967 | 0.791488 | 810.6 | 735.985714 | 323.811808 | 355.562967 | 160.25 | 105.514286 | 136.126003 | 84.528423 | 1.259244 | 1.207040 | 205 | 42.914286 | 75.237540 | 69.235569 | 4654.35725 | 2961.502286 | 724.953211 | 53.543811 | 1.033129 | 0.814598 | 8958.571 | 1579.583429 | 3306.162897 | 3572.596624 | 81.8375 | 111.727143 | 60.123179 | 99.414682 | 1.159687 | 0.787382 | 127.05 | 80.987143 | 51.433712 | 42.558396 | 6.9055 | 3.846857 | 3.479475 | 1.040986 | 1.088575 | 0.994998 | 12.878 | 1.744571 | 4.599064 | 4.666920 | 107.756645 | 61.015189 | 7.062488 | 0.621979 | 0.308148 | 0.262848 | 399.97342 | 57.127669 | 168.854244 | 138.517163 | 2.25 | 2.257143 | 2.213364 | 2.219783 | 1.368922 | 1.066221 | 1 | 1.085714 | 0.433013 | 0.437059 | 29.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.20 | 1.80 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 29.0 | Ba0.2La1.8Cu1O4 |
| 1 | 5 | 92.729214 | 58.518416 | 73.132787 | 36.396602 | 1.449309 | 1.057755 | 122.90607 | 36.161939 | 47.094633 | 53.979870 | 766.440 | 1010.612857 | 720.605511 | 938.745413 | 1.544145 | 0.807078 | 810.6 | 743.164286 | 290.183029 | 354.963511 | 161.20 | 104.971429 | 141.465215 | 84.370167 | 1.508328 | 1.204115 | 205 | 50.571429 | 67.321319 | 68.008817 | 5821.48580 | 3021.016571 | 1237.095080 | 54.095718 | 1.314442 | 0.914802 | 10488.571 | 1667.383429 | 3767.403176 | 3632.649185 | 90.8900 | 112.316429 | 69.833315 | 101.166398 | 1.427997 | 0.838666 | 127.05 | 81.207857 | 49.438167 | 41.667621 | 7.7844 | 3.796857 | 4.403790 | 1.035251 | 1.374977 | 1.073094 | 12.878 | 1.595714 | 4.473363 | 4.603000 | 172.205316 | 61.372331 | 16.064228 | 0.619735 | 0.847404 | 0.567706 | 429.97342 | 51.413383 | 198.554600 | 139.630922 | 2.00 | 2.257143 | 1.888175 | 2.210679 | 1.557113 | 1.047221 | 2 | 1.128571 | 0.632456 | 0.468606 | 26.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.10 | 1.90 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 26.0 | Ba0.1La1.9Ag0.1Cu0.9O4 |
| 2 | 4 | 88.944468 | 57.885242 | 66.361592 | 36.122509 | 1.181795 | 0.975980 | 122.90607 | 35.741099 | 51.968828 | 53.656268 | 775.425 | 1010.820000 | 718.152900 | 939.009036 | 1.305967 | 0.773620 | 810.6 | 743.164286 | 323.811808 | 354.804183 | 160.25 | 104.685714 | 136.126003 | 84.214573 | 1.259244 | 1.132547 | 205 | 49.314286 | 75.237540 | 67.797712 | 4654.35725 | 2999.159429 | 724.953211 | 53.974022 | 1.033129 | 0.760305 | 8958.571 | 1667.383429 | 3306.162897 | 3592.019281 | 81.8375 | 112.213571 | 60.123179 | 101.082152 | 1.159687 | 0.786007 | 127.05 | 81.207857 | 51.433712 | 41.639878 | 6.9055 | 3.822571 | 3.479475 | 1.037439 | 1.088575 | 0.927479 | 12.878 | 1.757143 | 4.599064 | 4.649635 | 107.756645 | 60.943760 | 7.062488 | 0.619095 | 0.308148 | 0.250477 | 399.97342 | 57.127669 | 168.854244 | 138.540613 | 2.25 | 2.271429 | 2.213364 | 2.232679 | 1.368922 | 1.029175 | 1 | 1.114286 | 0.433013 | 0.444697 | 19.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.10 | 1.90 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 19.0 | Ba0.1La1.9Cu1O4 |
| 3 | 4 | 88.944468 | 57.873967 | 66.361592 | 36.119560 | 1.181795 | 1.022291 | 122.90607 | 33.768010 | 51.968828 | 53.639405 | 775.425 | 1010.544286 | 718.152900 | 938.512777 | 1.305967 | 0.783207 | 810.6 | 739.575000 | 323.811808 | 355.183884 | 160.25 | 105.100000 | 136.126003 | 84.371352 | 1.259244 | 1.173033 | 205 | 46.114286 | 75.237540 | 68.521665 | 4654.35725 | 2980.330857 | 724.953211 | 53.758486 | 1.033129 | 0.788889 | 8958.571 | 1623.483429 | 3306.162897 | 3582.370597 | 81.8375 | 111.970357 | 60.123179 | 100.244950 | 1.159687 | 0.786900 | 127.05 | 81.097500 | 51.433712 | 42.102344 | 6.9055 | 3.834714 | 3.479475 | 1.039211 | 1.088575 | 0.964031 | 12.878 | 1.744571 | 4.599064 | 4.658301 | 107.756645 | 60.979474 | 7.062488 | 0.620535 | 0.308148 | 0.257045 | 399.97342 | 57.127669 | 168.854244 | 138.528893 | 2.25 | 2.264286 | 2.213364 | 2.226222 | 1.368922 | 1.048834 | 1 | 1.100000 | 0.433013 | 0.440952 | 22.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.15 | 1.85 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 22.0 | Ba0.15La1.85Cu1O4 |
| 4 | 4 | 88.944468 | 57.840143 | 66.361592 | 36.110716 | 1.181795 | 1.129224 | 122.90607 | 27.848743 | 51.968828 | 53.588771 | 775.425 | 1009.717143 | 718.152900 | 937.025573 | 1.305967 | 0.805230 | 810.6 | 728.807143 | 323.811808 | 356.319281 | 160.25 | 106.342857 | 136.126003 | 84.843442 | 1.259244 | 1.261194 | 205 | 36.514286 | 75.237540 | 70.634448 | 4654.35725 | 2923.845143 | 724.953211 | 53.117029 | 1.033129 | 0.859811 | 8958.571 | 1491.783429 | 3306.162897 | 3552.668664 | 81.8375 | 111.240714 | 60.123179 | 97.774719 | 1.159687 | 0.787396 | 127.05 | 80.766429 | 51.433712 | 43.452059 | 6.9055 | 3.871143 | 3.479475 | 1.044545 | 1.088575 | 1.044970 | 12.878 | 1.744571 | 4.599064 | 4.684014 | 107.756645 | 61.086617 | 7.062488 | 0.624878 | 0.308148 | 0.272820 | 399.97342 | 57.127669 | 168.854244 | 138.493671 | 2.25 | 2.242857 | 2.213364 | 2.206963 | 1.368922 | 1.096052 | 1 | 1.057143 | 0.433013 | 0.428809 | 23.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.30 | 1.70 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 23.0 | Ba0.3La1.7Cu1O4 |
# Check that the merge did not produce a Cartesian product (the row count should still be 21263)
merged_df.shape
(21263, 170)
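The row count confirms that no rows were multiplied, but an index-based merge also relies on both files listing the materials in the same order. One way to check that, sketched below on toy frames with hypothetical values, is to compare the two critical temperature columns that the merge produces (pandas adds the _x and _y suffixes to overlapping column names by default).

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the two files: both carry a critical_temp column.
left = pd.DataFrame({
    "number_of_elements": [4, 5, 4],
    "critical_temp": [29.0, 26.0, 19.0],
})
right = pd.DataFrame({
    "O": [4.0, 4.0, 4.0],
    "critical_temp": [29.0, 26.0, 19.0],
    "material": ["Ba0.2La1.8Cu1O4", "Ba0.1La1.9Ag0.1Cu0.9O4", "Ba0.1La1.9Cu1O4"],
})

merged = pd.merge(left, right, left_index=True, right_index=True, how="left")

# Same number of rows as the left frame: no Cartesian blow-up.
same_rows = len(merged) == len(left)

# The two copies of the response agree row by row: the files are aligned.
targets_agree = bool(
    np.isclose(merged["critical_temp_x"], merged["critical_temp_y"]).all()
)
print(same_rows, targets_agree)
```

If the same comparison on merged_df returned True, dropping critical_temp_y would lose no information; if it returned False, the files would need to be joined on a real key instead of on row position.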
# Drop the secondary critical temp from unique_m dataset
merged_df_final = merged_df.drop(['critical_temp_y','material'], axis=1)
# Check the secondary critical temp is removed
merged_df_final.head()
| | number_of_elements | mean_atomic_mass | wtd_mean_atomic_mass | gmean_atomic_mass | wtd_gmean_atomic_mass | entropy_atomic_mass | wtd_entropy_atomic_mass | range_atomic_mass | wtd_range_atomic_mass | std_atomic_mass | wtd_std_atomic_mass | mean_fie | wtd_mean_fie | gmean_fie | wtd_gmean_fie | entropy_fie | wtd_entropy_fie | range_fie | wtd_range_fie | std_fie | wtd_std_fie | mean_atomic_radius | wtd_mean_atomic_radius | gmean_atomic_radius | wtd_gmean_atomic_radius | entropy_atomic_radius | wtd_entropy_atomic_radius | range_atomic_radius | wtd_range_atomic_radius | std_atomic_radius | wtd_std_atomic_radius | mean_Density | wtd_mean_Density | gmean_Density | wtd_gmean_Density | entropy_Density | wtd_entropy_Density | range_Density | wtd_range_Density | std_Density | wtd_std_Density | mean_ElectronAffinity | wtd_mean_ElectronAffinity | gmean_ElectronAffinity | wtd_gmean_ElectronAffinity | entropy_ElectronAffinity | wtd_entropy_ElectronAffinity | range_ElectronAffinity | wtd_range_ElectronAffinity | std_ElectronAffinity | wtd_std_ElectronAffinity | mean_FusionHeat | wtd_mean_FusionHeat | gmean_FusionHeat | wtd_gmean_FusionHeat | entropy_FusionHeat | wtd_entropy_FusionHeat | range_FusionHeat | wtd_range_FusionHeat | std_FusionHeat | wtd_std_FusionHeat | mean_ThermalConductivity | wtd_mean_ThermalConductivity | gmean_ThermalConductivity | wtd_gmean_ThermalConductivity | entropy_ThermalConductivity | wtd_entropy_ThermalConductivity | range_ThermalConductivity | wtd_range_ThermalConductivity | std_ThermalConductivity | wtd_std_ThermalConductivity | mean_Valence | wtd_mean_Valence | gmean_Valence | wtd_gmean_Valence | entropy_Valence | wtd_entropy_Valence | range_Valence | wtd_range_Valence | std_Valence | wtd_std_Valence | critical_temp_x | H | He | Li | Be | B | C | N | O | F | Ne | Na | Mg | Al | Si | P | S | Cl | Ar | K | Ca | Sc | Ti | V | Cr | Mn | Fe | Co | Ni | Cu | Zn | Ga | Ge | As | Se | Br | Kr | Rb | Sr | Y | Zr | Nb | Mo | Tc | Ru | Rh | Pd | Ag | Cd | In | Sn | Sb | Te | I | Xe | Cs | Ba | La | Ce | Pr | Nd | Pm | Sm | Eu | Gd | Tb | Dy | Ho | Er | Tm | Yb | Lu | Hf | Ta | W | Re | Os | Ir | Pt | Au | Hg | Tl | Pb | Bi | Po | At | Rn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4 | 88.944468 | 57.862692 | 66.361592 | 36.116612 | 1.181795 | 1.062396 | 122.90607 | 31.794921 | 51.968828 | 53.622535 | 775.425 | 1010.268571 | 718.152900 | 938.016780 | 1.305967 | 0.791488 | 810.6 | 735.985714 | 323.811808 | 355.562967 | 160.25 | 105.514286 | 136.126003 | 84.528423 | 1.259244 | 1.207040 | 205 | 42.914286 | 75.237540 | 69.235569 | 4654.35725 | 2961.502286 | 724.953211 | 53.543811 | 1.033129 | 0.814598 | 8958.571 | 1579.583429 | 3306.162897 | 3572.596624 | 81.8375 | 111.727143 | 60.123179 | 99.414682 | 1.159687 | 0.787382 | 127.05 | 80.987143 | 51.433712 | 42.558396 | 6.9055 | 3.846857 | 3.479475 | 1.040986 | 1.088575 | 0.994998 | 12.878 | 1.744571 | 4.599064 | 4.666920 | 107.756645 | 61.015189 | 7.062488 | 0.621979 | 0.308148 | 0.262848 | 399.97342 | 57.127669 | 168.854244 | 138.517163 | 2.25 | 2.257143 | 2.213364 | 2.219783 | 1.368922 | 1.066221 | 1 | 1.085714 | 0.433013 | 0.437059 | 29.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.20 | 1.80 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 |
| 1 | 5 | 92.729214 | 58.518416 | 73.132787 | 36.396602 | 1.449309 | 1.057755 | 122.90607 | 36.161939 | 47.094633 | 53.979870 | 766.440 | 1010.612857 | 720.605511 | 938.745413 | 1.544145 | 0.807078 | 810.6 | 743.164286 | 290.183029 | 354.963511 | 161.20 | 104.971429 | 141.465215 | 84.370167 | 1.508328 | 1.204115 | 205 | 50.571429 | 67.321319 | 68.008817 | 5821.48580 | 3021.016571 | 1237.095080 | 54.095718 | 1.314442 | 0.914802 | 10488.571 | 1667.383429 | 3767.403176 | 3632.649185 | 90.8900 | 112.316429 | 69.833315 | 101.166398 | 1.427997 | 0.838666 | 127.05 | 81.207857 | 49.438167 | 41.667621 | 7.7844 | 3.796857 | 4.403790 | 1.035251 | 1.374977 | 1.073094 | 12.878 | 1.595714 | 4.473363 | 4.603000 | 172.205316 | 61.372331 | 16.064228 | 0.619735 | 0.847404 | 0.567706 | 429.97342 | 51.413383 | 198.554600 | 139.630922 | 2.00 | 2.257143 | 1.888175 | 2.210679 | 1.557113 | 1.047221 | 2 | 1.128571 | 0.632456 | 0.468606 | 26.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.10 | 1.90 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 |
| 2 | 4 | 88.944468 | 57.885242 | 66.361592 | 36.122509 | 1.181795 | 0.975980 | 122.90607 | 35.741099 | 51.968828 | 53.656268 | 775.425 | 1010.820000 | 718.152900 | 939.009036 | 1.305967 | 0.773620 | 810.6 | 743.164286 | 323.811808 | 354.804183 | 160.25 | 104.685714 | 136.126003 | 84.214573 | 1.259244 | 1.132547 | 205 | 49.314286 | 75.237540 | 67.797712 | 4654.35725 | 2999.159429 | 724.953211 | 53.974022 | 1.033129 | 0.760305 | 8958.571 | 1667.383429 | 3306.162897 | 3592.019281 | 81.8375 | 112.213571 | 60.123179 | 101.082152 | 1.159687 | 0.786007 | 127.05 | 81.207857 | 51.433712 | 41.639878 | 6.9055 | 3.822571 | 3.479475 | 1.037439 | 1.088575 | 0.927479 | 12.878 | 1.757143 | 4.599064 | 4.649635 | 107.756645 | 60.943760 | 7.062488 | 0.619095 | 0.308148 | 0.250477 | 399.97342 | 57.127669 | 168.854244 | 138.540613 | 2.25 | 2.271429 | 2.213364 | 2.232679 | 1.368922 | 1.029175 | 1 | 1.114286 | 0.433013 | 0.444697 | 19.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.10 | 1.90 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 |
| 3 | 4 | 88.944468 | 57.873967 | 66.361592 | 36.119560 | 1.181795 | 1.022291 | 122.90607 | 33.768010 | 51.968828 | 53.639405 | 775.425 | 1010.544286 | 718.152900 | 938.512777 | 1.305967 | 0.783207 | 810.6 | 739.575000 | 323.811808 | 355.183884 | 160.25 | 105.100000 | 136.126003 | 84.371352 | 1.259244 | 1.173033 | 205 | 46.114286 | 75.237540 | 68.521665 | 4654.35725 | 2980.330857 | 724.953211 | 53.758486 | 1.033129 | 0.788889 | 8958.571 | 1623.483429 | 3306.162897 | 3582.370597 | 81.8375 | 111.970357 | 60.123179 | 100.244950 | 1.159687 | 0.786900 | 127.05 | 81.097500 | 51.433712 | 42.102344 | 6.9055 | 3.834714 | 3.479475 | 1.039211 | 1.088575 | 0.964031 | 12.878 | 1.744571 | 4.599064 | 4.658301 | 107.756645 | 60.979474 | 7.062488 | 0.620535 | 0.308148 | 0.257045 | 399.97342 | 57.127669 | 168.854244 | 138.528893 | 2.25 | 2.264286 | 2.213364 | 2.226222 | 1.368922 | 1.048834 | 1 | 1.100000 | 0.433013 | 0.440952 | 22.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.15 | 1.85 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 |
| 4 | 4 | 88.944468 | 57.840143 | 66.361592 | 36.110716 | 1.181795 | 1.129224 | 122.90607 | 27.848743 | 51.968828 | 53.588771 | 775.425 | 1009.717143 | 718.152900 | 937.025573 | 1.305967 | 0.805230 | 810.6 | 728.807143 | 323.811808 | 356.319281 | 160.25 | 106.342857 | 136.126003 | 84.843442 | 1.259244 | 1.261194 | 205 | 36.514286 | 75.237540 | 70.634448 | 4654.35725 | 2923.845143 | 724.953211 | 53.117029 | 1.033129 | 0.859811 | 8958.571 | 1491.783429 | 3306.162897 | 3552.668664 | 81.8375 | 111.240714 | 60.123179 | 97.774719 | 1.159687 | 0.787396 | 127.05 | 80.766429 | 51.433712 | 43.452059 | 6.9055 | 3.871143 | 3.479475 | 1.044545 | 1.088575 | 1.044970 | 12.878 | 1.744571 | 4.599064 | 4.684014 | 107.756645 | 61.086617 | 7.062488 | 0.624878 | 0.308148 | 0.272820 | 399.97342 | 57.127669 | 168.854244 | 138.493671 | 2.25 | 2.242857 | 2.213364 | 2.206963 | 1.368922 | 1.096052 | 1 | 1.057143 | 0.433013 | 0.428809 | 23.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.30 | 1.70 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 |
Data Type Conversion
In this section we grouped all features by their correct data type and converted each feature to its corresponding type. This makes it much easier to analyze the statistics of each feature type.
# Features grouped by data type
cat_features = ['number_of_elements','H','He','Li','Be','B','C','N','O','F','Ne','Na','Mg','Al','Si','P','S','Cl','Ar','K',
'Ca','Sc','Ti','V','Cr','Mn','Fe','Co','Ni','Cu','Zn','Ga','Ge','As','Se','Br','Kr','Rb','Sr','Y','Zr','Nb',
'Mo','Tc','Ru','Rh','Pd','Ag','Cd','In','Sn','Sb','Te','I','Xe','Cs','Ba','La','Ce','Pr','Nd','Pm','Sm','Eu',
'Gd','Tb','Dy','Ho','Er','Tm','Yb','Lu','Hf','Ta','W','Re','Os','Ir','Pt','Au','Hg','Tl','Pb','Bi','Po','At',
'Rn']
cont_features = ['mean_atomic_mass','wtd_mean_atomic_mass','gmean_atomic_mass','wtd_gmean_atomic_mass',
'entropy_atomic_mass','wtd_entropy_atomic_mass','range_atomic_mass','wtd_range_atomic_mass','std_atomic_mass',
'wtd_std_atomic_mass','mean_fie', 'wtd_mean_fie','gmean_fie','wtd_gmean_fie','entropy_fie','wtd_entropy_fie',
'range_fie','wtd_range_fie','std_fie','wtd_std_fie','mean_atomic_radius','wtd_mean_atomic_radius',
'gmean_atomic_radius','wtd_gmean_atomic_radius','entropy_atomic_radius','wtd_entropy_atomic_radius',
'range_atomic_radius','wtd_range_atomic_radius','std_atomic_radius','wtd_std_atomic_radius','mean_Density',
'wtd_mean_Density','gmean_Density','wtd_gmean_Density','entropy_Density','wtd_entropy_Density','range_Density',
'wtd_range_Density','std_Density','wtd_std_Density','mean_ElectronAffinity','wtd_mean_ElectronAffinity',
'gmean_ElectronAffinity','wtd_gmean_ElectronAffinity','entropy_ElectronAffinity','wtd_entropy_ElectronAffinity',
'range_ElectronAffinity','wtd_range_ElectronAffinity','std_ElectronAffinity','wtd_std_ElectronAffinity',
'mean_FusionHeat','wtd_mean_FusionHeat','gmean_FusionHeat','wtd_gmean_FusionHeat','entropy_FusionHeat',
'wtd_entropy_FusionHeat','range_FusionHeat','wtd_range_FusionHeat','std_FusionHeat','wtd_std_FusionHeat',
'mean_ThermalConductivity','wtd_mean_ThermalConductivity','gmean_ThermalConductivity',
'wtd_gmean_ThermalConductivity','entropy_ThermalConductivity','wtd_entropy_ThermalConductivity',
'range_ThermalConductivity','wtd_range_ThermalConductivity','std_ThermalConductivity',
'wtd_std_ThermalConductivity','mean_Valence','wtd_mean_Valence','gmean_Valence','wtd_gmean_Valence',
'entropy_Valence','wtd_entropy_Valence','range_Valence','wtd_range_Valence','std_Valence','wtd_std_Valence']
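The conversion step itself is not shown above, so the following is only a minimal sketch of how two such lists could be applied. The frame `demo_df` and the shortened lists `demo_cat` and `demo_cont` are hypothetical stand-ins for `merged_df_final`, `cat_features`, and `cont_features`, and the choice of the `category` and `float64` dtypes is an assumption.

```python
import pandas as pd

# Hypothetical miniature frame standing in for merged_df_final (illustrative values).
demo_df = pd.DataFrame({
    "number_of_elements": [4, 5, 4],
    "O": [4.0, 4.0, 4.0],
    "Cu": [1.0, 0.9, 1.0],
    "mean_atomic_mass": ["88.944468", "92.729214", "88.944468"],
})

# Shortened stand-ins for the full cat_features / cont_features lists.
demo_cat = ["number_of_elements", "O", "Cu"]
demo_cont = ["mean_atomic_mass"]

# Build one column -> dtype mapping and cast every column in a single call.
dtype_map = {col: "category" for col in demo_cat}
dtype_map.update({col: "float64" for col in demo_cont})
demo_df = demo_df.astype(dtype_map)

print(demo_df.dtypes)
```

With the real data, the same mapping could be built from the two full lists. Columns not named in the mapping, such as the target, keep their existing dtype.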
Outliers
# Histogram of Critical Temperature
fig = px.histogram(merged_df_final, x="critical_temp_x", nbins = 20)
fig.show()
The most interesting detail in this histogram is how widely critical temperature varies. According to the summary statistics for critical_temp later in this section, it ranges from just above 0 K to 185 K, with a median of 20 K. Across the 21,263 observations, even the most frequent single value, 80, occurs only 143 times (0.67%). The histogram is right-skewed, with a long tail of high-temperature materials that may act as outliers, so we may want to log-transform critical temperature to reduce the skew before modeling.
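To illustrate the log scaling suggested above, here is a small sketch on synthetic data. The sample is drawn from a lognormal distribution purely as a stand-in for a right-skewed target; it is not the study data, and `sample_temp` and `log_temp` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Synthetic right-skewed sample; NOT the study data, only an illustrative stand-in.
rng = np.random.default_rng(0)
sample_temp = pd.Series(
    rng.lognormal(mean=3.0, sigma=1.0, size=5000), name="critical_temp_x"
)

# log1p computes log(1 + x), which stays finite for values at or near zero.
log_temp = np.log1p(sample_temp)

# Compare sample skewness before and after the transform.
skew_before = sample_temp.skew()
skew_after = log_temp.skew()
print(f"skewness before: {skew_before:.2f}, after log1p: {skew_after:.2f}")
```

If a model is fitted on the transformed target, its predictions would need to be mapped back with `np.expm1` before being reported as temperatures.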
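# Box plot - number of elements for the filtered rows
# Note: `== True` compares the numeric column with 1, so only rows whose
# critical temperature is exactly 1.0 are kept.
fig = px.box(merged_df_final[merged_df_final['critical_temp_x'] == True], x='number_of_elements',
             width=800, height=400, title='Box Plot - Number of Elements')
fig.show()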
The box plot above is drawn from the rows selected by the filter on critical_temp_x. Within that subset, most materials contain three or fewer elements, and the upper whisker reaches four. A few outliers contain six elements, but the large majority contain three.
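Box plot whiskers and outlier points are conventionally based on the interquartile range (IQR). The sketch below computes the usual 1.5 × IQR fences by hand on a hypothetical list of element counts. The values are illustrative only, and Plotly may use a slightly different quartile convention than the pandas default used here.

```python
import pandas as pd

# Hypothetical element counts; illustrative only, not taken from the data set.
n_elements = pd.Series([3, 2, 3, 1, 3, 2, 4, 3, 2, 6, 3])

# Quartiles and interquartile range (pandas default: linear interpolation).
q1 = n_elements.quantile(0.25)
q3 = n_elements.quantile(0.75)
iqr = q3 - q1

# Tukey fences: points beyond 1.5 * IQR from the quartiles are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = n_elements[(n_elements < lower_fence) | (n_elements > upper_fence)]
print(f"Q1={q1}, Q3={q3}, fences=({lower_fence}, {upper_fence})")
print(outliers.tolist())
```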
summary_stats = merged_df_final.describe()
summary_stats_tb = summary_stats.transpose()
pd.set_option('display.max_columns', None)
pd.set_option('display.float_format', '{:2f}'.format) # Format numeric values
print(summary_stats_tb)
count mean std min 25% \
number_of_elements 21263.000000 4.115224 1.439295 1.000000 3.000000
mean_atomic_mass 21263.000000 87.557631 29.676497 6.941000 72.458076
wtd_mean_atomic_mass 21263.000000 72.988310 33.490406 6.423452 52.143839
gmean_atomic_mass 21263.000000 71.290627 31.030272 5.320573 58.041225
wtd_gmean_atomic_mass 21263.000000 58.539916 36.651067 1.960849 35.248990
... ... ... ... ... ...
Pb 21263.000000 0.042461 0.274365 0.000000 0.000000
Bi 21263.000000 0.201009 0.655927 0.000000 0.000000
Po 21263.000000 0.000000 0.000000 0.000000 0.000000
At 21263.000000 0.000000 0.000000 0.000000 0.000000
Rn 21263.000000 0.000000 0.000000 0.000000 0.000000
50% 75% max
number_of_elements 4.000000 5.000000 9.000000
mean_atomic_mass 84.922750 100.404410 208.980400
wtd_mean_atomic_mass 60.696571 86.103540 208.980400
gmean_atomic_mass 66.361592 78.116681 208.980400
wtd_gmean_atomic_mass 39.918385 73.113234 208.980400
... ... ... ...
Pb 0.000000 0.000000 19.000000
Bi 0.000000 0.000000 14.000000
Po 0.000000 0.000000 0.000000
At 0.000000 0.000000 0.000000
Rn 0.000000 0.000000 0.000000
[168 rows x 8 columns]
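The summary above lists Po, At, and Rn with a mean, standard deviation, and maximum of zero, meaning those elements never appear in any material. A sketch for finding such constant columns from a `describe()` table follows; `demo_df` is a hypothetical miniature frame, and dropping the columns is one possible choice rather than a step taken in this study.

```python
import pandas as pd

# Hypothetical miniature frame; Po mimics an element that never appears.
demo_df = pd.DataFrame({
    "mean_atomic_mass": [88.94, 92.73, 88.94, 76.44],
    "Cu": [1.0, 0.9, 1.0, 0.0],
    "Po": [0.0, 0.0, 0.0, 0.0],
})

summary = demo_df.describe().transpose()

# A standard deviation of zero means every row holds the same value.
constant_cols = summary.index[summary["std"] == 0].tolist()
print(constant_cols)

# Dropping constant columns is one option before correlation or regression.
reduced_df = demo_df.drop(columns=constant_cols)
```

Constant columns carry no information for a regression, and their correlation with any other column is undefined.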
summary_stats_m = unique_m.describe()
summary_stats_tb_m = summary_stats_m.transpose()
pd.set_option('display.max_columns', None)
pd.set_option('display.float_format', '{:2f}'.format) # Format numeric values
print(summary_stats_tb_m)
count mean std min 25% 50% \
H 21263.000000 0.017685 0.267220 0.000000 0.000000 0.000000
He 21263.000000 0.000000 0.000000 0.000000 0.000000 0.000000
Li 21263.000000 0.012125 0.129552 0.000000 0.000000 0.000000
Be 21263.000000 0.034638 0.848541 0.000000 0.000000 0.000000
B 21263.000000 0.142594 1.044486 0.000000 0.000000 0.000000
... ... ... ... ... ... ...
Bi 21263.000000 0.201009 0.655927 0.000000 0.000000 0.000000
Po 21263.000000 0.000000 0.000000 0.000000 0.000000 0.000000
At 21263.000000 0.000000 0.000000 0.000000 0.000000 0.000000
Rn 21263.000000 0.000000 0.000000 0.000000 0.000000 0.000000
critical_temp 21263.000000 34.421219 34.254362 0.000210 5.365000 20.000000
75% max
H 0.000000 14.000000
He 0.000000 0.000000
Li 0.000000 3.000000
Be 0.000000 40.000000
B 0.000000 105.000000
... ... ...
Bi 0.000000 14.000000
Po 0.000000 0.000000
At 0.000000 0.000000
Rn 0.000000 0.000000
critical_temp 63.000000 185.000000
[87 rows x 8 columns]
# Frequency of each critical temperature value
crit_temp_df = pd.DataFrame(merged_df_final['critical_temp_x'].value_counts()).reset_index()
crit_temp_df.columns = ['critical_temp', 'Count']
crit_temp_df['Frequency'] = round(crit_temp_df['Count'] / sum(crit_temp_df['Count']) * 100, 2)
crit_temp_df
|  | critical_temp | Count | Frequency |
|---|---|---|---|
| 0 | 80.000000 | 143 | 0.670000 |
| 1 | 20.000000 | 129 | 0.610000 |
| 2 | 30.000000 | 125 | 0.590000 |
| 3 | 90.000000 | 122 | 0.570000 |
| 4 | 40.000000 | 111 | 0.520000 |
| ... | ... | ... | ... |
| 3002 | 6.170000 | 1 | 0.000000 |
| 3003 | 4.345000 | 1 | 0.000000 |
| 3004 | 20.460000 | 1 | 0.000000 |
| 3005 | 19.090000 | 1 | 0.000000 |
| 3006 | 122.100000 | 1 | 0.000000 |
3007 rows × 3 columns
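The table shows 3,007 distinct critical temperature values, most of which occur only once. As an alternative to dividing by the total count by hand, `value_counts(normalize=True)` returns the proportions directly. The sketch below uses a hypothetical series `temps` with illustrative values.

```python
import pandas as pd

# Hypothetical critical temperatures; illustrative values only.
temps = pd.Series(
    [80.0, 80.0, 20.0, 30.0, 80.0, 20.0, 6.17, 122.1], name="critical_temp"
)

# Raw counts, sorted from most to least frequent.
counts = temps.value_counts()

# Proportions expressed as percentages, rounded to two decimals.
freq_pct = (temps.value_counts(normalize=True) * 100).round(2)

# Assemble the table; column assignment aligns on the shared index.
freq_table = counts.to_frame("Count")
freq_table["Frequency"] = freq_pct
freq_table.index.name = "critical_temp"
freq_table = freq_table.reset_index()
print(freq_table)
```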
Below is a histogram of mean atomic mass across the superconductors in the data set. The counts rise rapidly around 72 to 78, drop significantly at about 80, and rise again at about 88. They then decline gradually through many peaks and valleys until about 144, beyond which the peaks become noticeably smaller and less frequent. The bulk of the data lies between 70 and 90, with a median of about 84.9 (the mean is about 87.6).
# Histogram of Atomic mass
fig = px.histogram(merged_df_final, x='mean_atomic_mass', marginal="box",width=800, height=400,
title='Distribution Plot - Average Atomic Mass')
fig.show()
Below is a histogram of weighted mean thermal conductivity. Its shape resembles the mean atomic mass histogram above, with similar peaks and valleys but a more pronounced right skew. The counts rise rapidly to a large peak between 60 and 62, and the average is 73.3. Thermal conductivity describes how quickly and efficiently a material moves heat: materials with high thermal conductivity transfer heat rapidly from one location to another.
# Histogram by Weighted Thermal Conductivity
fig = px.histogram(merged_df_final, x='wtd_mean_ThermalConductivity',
marginal="box",width=800, height=400, title='Distribution Plot - Thermal Conductivity')
fig.show()
# A first step in exploring relationships between features is a correlation matrix
merged_df_final.corr()
|  | number_of_elements | mean_atomic_mass | wtd_mean_atomic_mass | gmean_atomic_mass | wtd_gmean_atomic_mass | entropy_atomic_mass | wtd_entropy_atomic_mass | range_atomic_mass | wtd_range_atomic_mass | std_atomic_mass | wtd_std_atomic_mass | mean_fie | wtd_mean_fie | gmean_fie | wtd_gmean_fie | entropy_fie | wtd_entropy_fie | range_fie | wtd_range_fie | std_fie | wtd_std_fie | mean_atomic_radius | wtd_mean_atomic_radius | gmean_atomic_radius | wtd_gmean_atomic_radius | entropy_atomic_radius | wtd_entropy_atomic_radius | range_atomic_radius | wtd_range_atomic_radius | std_atomic_radius | wtd_std_atomic_radius | mean_Density | wtd_mean_Density | gmean_Density | wtd_gmean_Density | entropy_Density | wtd_entropy_Density | range_Density | wtd_range_Density | std_Density | wtd_std_Density | mean_ElectronAffinity | wtd_mean_ElectronAffinity | gmean_ElectronAffinity | wtd_gmean_ElectronAffinity | entropy_ElectronAffinity | wtd_entropy_ElectronAffinity | range_ElectronAffinity | wtd_range_ElectronAffinity | std_ElectronAffinity | wtd_std_ElectronAffinity | mean_FusionHeat | wtd_mean_FusionHeat | gmean_FusionHeat | wtd_gmean_FusionHeat | entropy_FusionHeat | wtd_entropy_FusionHeat | range_FusionHeat | wtd_range_FusionHeat | std_FusionHeat | wtd_std_FusionHeat | mean_ThermalConductivity | wtd_mean_ThermalConductivity | gmean_ThermalConductivity | wtd_gmean_ThermalConductivity | entropy_ThermalConductivity | wtd_entropy_ThermalConductivity | range_ThermalConductivity | wtd_range_ThermalConductivity | std_ThermalConductivity | wtd_std_ThermalConductivity | mean_Valence | wtd_mean_Valence | gmean_Valence | wtd_gmean_Valence | entropy_Valence | wtd_entropy_Valence | range_Valence | wtd_range_Valence | std_Valence | wtd_std_Valence | critical_temp_x | H | He | Li | Be | B | C | N | O | F | Ne | Na | Mg | Al | Si | P | S | Cl | Ar | K | Ca | Sc | Ti | V | Cr | Mn | Fe | Co | Ni | Cu | Zn | Ga | Ge | As | Se | Br | Kr | Rb | Sr | Y | Zr | Nb | Mo | Tc | Ru | Rh | Pd | Ag | Cd | In | Sn | Sb | Te | I | Xe | Cs | Ba | La | Ce | Pr | Nd | Pm | Sm | Eu | Gd | Tb | Dy | Ho | Er | Tm | Yb | Lu | Hf | Ta | W | Re | Os | Ir | Pt | Au | Hg | Tl | Pb | Bi | Po | At | Rn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| number_of_elements | 1.000000 | -0.141923 | -0.353064 | -0.292969 | -0.454525 | 0.939304 | 0.881845 | 0.682777 | -0.320293 | 0.513998 | 0.546391 | 0.167451 | 0.484445 | 0.024229 | 0.424152 | 0.973195 | 0.719209 | 0.781227 | 0.329624 | 0.674005 | 0.717831 | -0.001389 | -0.422144 | -0.240444 | -0.518256 | 0.972245 | 0.904121 | 0.768060 | -0.371350 | 0.624810 | 0.695089 | -0.418675 | -0.507895 | -0.630504 | -0.649882 | 0.871832 | 0.767078 | 0.413486 | -0.355389 | 0.210724 | 0.334072 | -0.119303 | 0.195608 | -0.356067 | -0.052884 | 0.877304 | 0.625798 | 0.531540 | 0.241411 | 0.423738 | 0.480813 | -0.437624 | -0.449272 | -0.514252 | -0.519109 | 0.900759 | 0.860479 | 0.005734 | -0.371788 | -0.113361 | -0.074796 | 0.227656 | 0.206069 | -0.485324 | -0.469206 | 0.501871 | 0.207065 | 0.696060 | 0.316772 | 0.602018 | 0.665580 | -0.609412 | -0.648551 | -0.618512 | -0.659268 | 0.967832 | 0.892559 | 0.231874 | -0.447770 | 0.105365 | 0.035216 | 0.601069 | 0.000950 | NaN | -0.033713 | -0.039801 | -0.069869 | -0.076534 | -0.046001 | 0.541654 | 0.083068 | NaN | -0.016082 | -0.062128 | -0.044387 | -0.069016 | -0.028974 | -0.058567 | 0.015625 | NaN | -0.048214 | 0.260772 | -0.027056 | -0.042975 | -0.065039 | -0.016153 | -0.013550 | 0.023825 | -0.029350 | -0.034140 | 0.416454 | -0.008316 | -0.062019 | -0.053670 | 0.028742 | -0.087924 | -0.002040 | NaN | -0.051874 | 0.475223 | 0.165323 | -0.092858 | -0.075090 | -0.062211 | -0.055049 | -0.041788 | -0.059083 | -0.048377 | -0.023183 | -0.009915 | -0.053554 | -0.055459 | -0.038986 | -0.048162 | 0.018590 | NaN | -0.030937 | 0.288504 | -0.006440 | 0.057134 | 0.010574 | 0.086448 | NaN | 0.059387 | 0.092002 | 0.131634 | -0.004529 | 0.033270 | 0.025838 | 0.038728 | 0.002195 | 0.013471 | -0.022754 | -0.040189 | -0.042999 | -0.053299 | -0.042371 | -0.056916 | -0.049955 | -0.036926 | -0.027695 | 0.116987 | 0.113393 | 0.064947 | 0.249738 | NaN | NaN | NaN |
| mean_atomic_mass | -0.141923 | 1.000000 | 0.815977 | 0.940298 | 0.745841 | -0.104000 | -0.097609 | 0.125659 | 0.446225 | 0.196460 | 0.130675 | -0.285782 | -0.222097 | -0.240565 | -0.219381 | -0.166935 | -0.163565 | -0.255628 | -0.080545 | -0.276561 | -0.222812 | 0.497664 | 0.376760 | 0.561061 | 0.359894 | -0.140034 | -0.147604 | -0.270695 | 0.141100 | -0.326403 | -0.280440 | 0.756861 | 0.608935 | 0.596485 | 0.525588 | -0.043416 | 0.026325 | 0.198067 | 0.342391 | 0.245042 | 0.180943 | 0.088230 | 0.061103 | 0.189282 | 0.134382 | -0.091539 | -0.107651 | -0.187069 | 0.010235 | -0.164960 | -0.133101 | -0.137669 | -0.135429 | 0.014818 | -0.043003 | -0.008499 | -0.028541 | -0.347582 | -0.167528 | -0.337969 | -0.335778 | -0.158266 | -0.065989 | 0.006004 | 0.056394 | -0.100077 | -0.098221 | -0.114538 | -0.027790 | -0.110658 | -0.110856 | 0.374099 | 0.304683 | 0.392153 | 0.321399 | -0.156786 | -0.145610 | -0.107450 | 0.168633 | -0.080279 | -0.081253 | -0.113523 | -0.111504 | NaN | -0.130059 | -0.021678 | -0.120888 | -0.115583 | -0.111170 | -0.093619 | -0.069942 | NaN | -0.115777 | -0.155525 | -0.045626 | -0.018588 | -0.021524 | -0.006956 | -0.112718 | NaN | -0.088519 | -0.041040 | -0.014483 | -0.026578 | -0.052516 | -0.010145 | -0.016231 | -0.076634 | -0.020210 | -0.066038 | -0.070633 | 0.020962 | -0.031919 | 0.049963 | -0.022760 | 0.028460 | -0.043488 | NaN | -0.055412 | -0.003923 | -0.126102 | -0.008436 | -0.029126 | -0.004621 | 0.009937 | 0.017572 | 0.036436 | 0.016032 | -0.014841 | 0.001028 | 0.106627 | 0.054920 | 0.056644 | 0.084814 | 0.001330 | NaN | -0.000976 | -0.044394 | 0.056738 | 0.083521 | 0.007759 | 0.010153 | NaN | -0.003401 | 0.033260 | 0.030458 | 0.028056 | 0.020760 | 0.016547 | 0.032158 | 0.028773 | 0.038125 | 0.083665 | 0.045299 | 0.061727 | 0.053075 | 0.063347 | 0.136228 | 0.089754 | 0.107023 | 0.055121 | 0.101282 | 0.113666 | 0.197505 | 0.121567 | NaN | NaN | NaN |
| wtd_mean_atomic_mass | -0.353064 | 0.815977 | 1.000000 | 0.848242 | 0.964085 | -0.308046 | -0.412666 | -0.144029 | 0.716623 | -0.060739 | -0.089471 | -0.209296 | -0.522595 | -0.109490 | -0.508109 | -0.369773 | -0.129779 | -0.452303 | -0.420457 | -0.459323 | -0.492250 | 0.288451 | 0.660011 | 0.468457 | 0.667112 | -0.345071 | -0.400483 | -0.524861 | 0.363882 | -0.551141 | -0.554820 | 0.749261 | 0.842665 | 0.712815 | 0.767011 | -0.246377 | -0.195894 | -0.002868 | 0.585687 | 0.103157 | 0.009921 | 0.147303 | -0.096427 | 0.272261 | 0.021877 | -0.290220 | -0.093796 | -0.225890 | -0.204480 | -0.197729 | -0.210757 | 0.006730 | 0.014681 | 0.164239 | 0.120044 | -0.225287 | -0.237218 | -0.283420 | -0.070411 | -0.253911 | -0.272806 | -0.236418 | -0.058075 | 0.184990 | 0.250226 | -0.076936 | 0.025638 | -0.376573 | -0.108512 | -0.362512 | -0.350993 | 0.534450 | 0.545587 | 0.539780 | 0.548981 | -0.375718 | -0.331025 | -0.039155 | 0.330904 | -0.003681 | 0.077323 | -0.312272 | -0.087297 | NaN | -0.064887 | -0.051196 | -0.109217 | -0.136420 | -0.062879 | -0.416796 | -0.027159 | NaN | -0.090717 | -0.102286 | -0.033009 | -0.029212 | -0.011657 | -0.013994 | -0.067755 | NaN | -0.071718 | -0.080540 | -0.002800 | -0.020755 | -0.032313 | -0.004656 | 0.000862 | -0.018524 | 0.008220 | -0.024193 | -0.229718 | -0.001851 | -0.000182 | 0.047000 | 0.000950 | 0.052034 | -0.020349 | NaN | -0.071687 | -0.092916 | -0.174097 | 0.028848 | 0.034874 | 0.019095 | 0.025377 | 0.027995 | 0.044557 | 0.050368 | 0.005236 | 0.009345 | 0.124607 | 0.077407 | 0.078442 | 0.096987 | 0.016405 | NaN | -0.033085 | -0.220160 | 0.043506 | 0.074608 | 0.010219 | -0.039916 | NaN | -0.019019 | -0.022564 | -0.035797 | 0.015752 | -0.017355 | -0.020333 | -0.010473 | 0.009757 | 0.023149 | 0.053253 | 0.052581 | 0.068113 | 0.064520 | 0.080614 | 0.124673 | 0.098004 | 0.111642 | 0.067790 | 0.037561 | 0.056807 | 0.169794 | 0.115858 | NaN | NaN | NaN |
| gmean_atomic_mass | -0.292969 | 0.940298 | 0.848242 | 1.000000 | 0.856975 | -0.190214 | -0.232183 | -0.175861 | 0.458473 | -0.121708 | -0.166042 | -0.367690 | -0.354664 | -0.286844 | -0.341585 | -0.316670 | -0.287701 | -0.431689 | -0.155439 | -0.450045 | -0.390843 | 0.510867 | 0.488822 | 0.647560 | 0.496461 | -0.282048 | -0.311701 | -0.460197 | 0.240296 | -0.512841 | -0.462397 | 0.779757 | 0.677131 | 0.728477 | 0.663642 | -0.125672 | -0.093881 | -0.024975 | 0.368143 | 0.037866 | -0.037299 | 0.079376 | -0.006353 | 0.219651 | 0.111858 | -0.238002 | -0.224757 | -0.284098 | -0.055316 | -0.249926 | -0.238028 | -0.092244 | -0.089139 | 0.086599 | 0.024199 | -0.126798 | -0.171928 | -0.384838 | -0.128758 | -0.361244 | -0.372666 | -0.190937 | -0.104940 | 0.110769 | 0.131642 | -0.116455 | -0.104644 | -0.243465 | -0.095661 | -0.233587 | -0.232079 | 0.487021 | 0.427961 | 0.511508 | 0.450357 | -0.306246 | -0.307662 | -0.165010 | 0.272303 | -0.124627 | -0.117336 | -0.230345 | -0.119243 | NaN | -0.138087 | -0.041760 | -0.145190 | -0.114904 | -0.109134 | -0.193470 | -0.086346 | NaN | -0.100417 | -0.136540 | -0.032450 | -0.021309 | -0.014900 | -0.017244 | -0.089040 | NaN | -0.081515 | -0.078359 | -0.012664 | -0.009310 | -0.024013 | -0.003682 | -0.005250 | -0.033847 | -0.001253 | -0.053141 | -0.132853 | 0.025385 | -0.002493 | 0.068915 | 0.000392 | 0.065670 | -0.037552 | NaN | -0.052847 | -0.072322 | -0.115229 | 0.025199 | -0.000178 | 0.009791 | 0.024807 | 0.033432 | 0.043205 | 0.033350 | -0.010431 | 0.003324 | 0.137184 | 0.076803 | 0.076529 | 0.103842 | -0.003148 | NaN | -0.011229 | -0.111109 | 0.063909 | 0.087015 | 0.003674 | -0.025485 | NaN | -0.025630 | 0.016533 | 0.003667 | 0.022969 | -0.005685 | -0.014925 | 0.005598 | 0.017337 | 0.026523 | 0.052581 | 0.043650 | 0.063578 | 0.018147 | 0.062554 | 0.140039 | 0.074542 | 0.091775 | 0.056722 | 0.032206 | 0.060171 | 0.169211 | 0.034077 | NaN | NaN | NaN |
| wtd_gmean_atomic_mass | -0.454525 | 0.745841 | 0.964085 | 0.856975 | 1.000000 | -0.370561 | -0.484664 | -0.352093 | 0.673326 | -0.274487 | -0.331657 | -0.276668 | -0.612317 | -0.154323 | -0.588014 | -0.471280 | -0.227652 | -0.575369 | -0.451326 | -0.578719 | -0.617363 | 0.301508 | 0.720901 | 0.527074 | 0.749593 | -0.441916 | -0.514618 | -0.645663 | 0.432896 | -0.665166 | -0.681130 | 0.740131 | 0.852608 | 0.789208 | 0.843708 | -0.300078 | -0.273122 | -0.163939 | 0.576836 | -0.048110 | -0.174098 | 0.119314 | -0.158608 | 0.274209 | -0.011941 | -0.395866 | -0.194127 | -0.300521 | -0.246388 | -0.262822 | -0.291153 | 0.054752 | 0.072658 | 0.219751 | 0.189353 | -0.313735 | -0.349844 | -0.279300 | -0.006009 | -0.239074 | -0.277760 | -0.248849 | -0.056793 | 0.271083 | 0.322335 | -0.076157 | 0.020495 | -0.464856 | -0.129212 | -0.447236 | -0.431027 | 0.599413 | 0.614100 | 0.608417 | 0.623261 | -0.477785 | -0.448072 | -0.078641 | 0.409674 | -0.033313 | 0.030361 | -0.369858 | -0.081418 | NaN | -0.077289 | -0.042417 | -0.114284 | -0.101973 | -0.062112 | -0.475707 | -0.039707 | NaN | -0.064887 | -0.085839 | -0.017499 | -0.023541 | -0.004654 | -0.017157 | -0.049457 | NaN | -0.056773 | -0.080323 | -0.000193 | -0.003153 | -0.007198 | 0.000615 | 0.008567 | 0.021970 | 0.022549 | -0.011455 | -0.257135 | 0.004203 | 0.022255 | 0.060508 | 0.028005 | 0.081196 | -0.017422 | NaN | -0.050966 | -0.136241 | -0.166371 | 0.051628 | 0.058673 | 0.032422 | 0.035285 | 0.038298 | 0.047940 | 0.061469 | 0.006435 | 0.011334 | 0.144345 | 0.091147 | 0.090722 | 0.106489 | 0.007775 | NaN | -0.020512 | -0.249967 | 0.040336 | 0.072340 | 0.008961 | -0.068187 | NaN | -0.036996 | -0.030590 | -0.049162 | 0.015089 | -0.030796 | -0.036411 | -0.025176 | 0.003907 | 0.018494 | 0.022324 | 0.048340 | 0.066058 | 0.038396 | 0.078045 | 0.121542 | 0.074275 | 0.092166 | 0.066047 | -0.006049 | 0.019589 | 0.148454 | 0.029298 | NaN | NaN | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| Pb | 0.064947 | 0.197505 | 0.169794 | 0.169211 | 0.148454 | 0.046266 | 0.048743 | 0.101324 | 0.101449 | 0.103382 | 0.070390 | -0.043793 | -0.038166 | -0.032042 | -0.035478 | 0.047307 | 0.053384 | -0.020137 | -0.029629 | -0.039860 | -0.028675 | -0.002635 | 0.019809 | 0.023962 | 0.028374 | 0.053665 | 0.042746 | -0.038289 | -0.015530 | -0.066456 | -0.051666 | 0.063647 | 0.056537 | 0.043781 | 0.044145 | 0.062639 | 0.081242 | 0.021400 | 0.023113 | 0.017309 | 0.007919 | -0.057314 | -0.043074 | -0.052592 | -0.061273 | 0.035930 | 0.023695 | -0.001974 | -0.029124 | -0.009704 | 0.018760 | -0.083310 | -0.070968 | -0.070125 | -0.064741 | 0.056560 | 0.060450 | -0.052320 | -0.059616 | -0.054728 | -0.053379 | -0.034022 | -0.007171 | -0.026512 | -0.026505 | 0.070847 | 0.049838 | 0.005455 | -0.013186 | -0.011324 | 0.003938 | 0.019358 | 0.012676 | 0.018126 | 0.013143 | 0.041077 | 0.050627 | 0.032337 | -0.005013 | 0.028546 | 0.023821 | 0.016864 | -0.007610 | NaN | -0.013029 | -0.006293 | -0.020985 | -0.012727 | -0.012129 | -0.005860 | -0.003560 | NaN | -0.006636 | -0.012837 | -0.008287 | -0.013228 | -0.008855 | 0.053905 | -0.011700 | NaN | -0.015719 | 0.055194 | -0.008142 | -0.008092 | -0.009833 | -0.001828 | -0.002963 | -0.033166 | -0.009377 | -0.014192 | 0.008438 | -0.005043 | -0.009751 | -0.012500 | -0.021840 | 0.051451 | -0.007267 | NaN | -0.009955 | 0.135993 | -0.026217 | -0.011208 | -0.013605 | 0.013636 | -0.005478 | -0.011000 | 0.018651 | -0.006152 | 0.030714 | -0.000280 | -0.009717 | -0.008342 | -0.007892 | 0.009317 | -0.008105 | NaN | 0.000535 | -0.077901 | -0.006656 | -0.015474 | -0.003790 | -0.016002 | NaN | -0.010050 | 0.015950 | -0.016513 | -0.001295 | -0.007175 | 0.000810 | -0.008278 | -0.007613 | -0.000982 | -0.013492 | -0.006790 | -0.003737 | -0.009643 | -0.005022 | -0.012343 | -0.011016 | -0.014156 | 0.000110 | -0.002473 | 0.008468 | 1.000000 | 0.067402 | NaN | NaN | NaN |
| Bi | 0.249738 | 0.121567 | 0.115858 | 0.034077 | 0.029298 | 0.167009 | 0.155360 | 0.381924 | 0.088625 | 0.341030 | 0.371978 | 0.028473 | 0.048413 | 0.014709 | 0.041231 | 0.218048 | 0.234481 | 0.148581 | -0.024473 | 0.105363 | 0.112067 | -0.110631 | -0.092406 | -0.113207 | -0.084334 | 0.223312 | 0.241132 | 0.090336 | -0.152969 | 0.033068 | 0.041451 | -0.073299 | -0.073785 | -0.129492 | -0.118404 | 0.165771 | 0.221612 | 0.067845 | -0.101047 | 0.062044 | 0.082554 | 0.038165 | 0.059036 | -0.046882 | -0.037509 | 0.207058 | 0.210033 | 0.105201 | -0.001571 | 0.073922 | 0.104452 | -0.161109 | -0.150656 | -0.148619 | -0.145667 | 0.241057 | 0.294931 | -0.111558 | -0.148439 | -0.128617 | -0.125032 | 0.037179 | -0.026681 | -0.145483 | -0.129927 | 0.100508 | 0.178005 | 0.154591 | -0.034808 | 0.124353 | 0.102590 | -0.059411 | -0.061718 | -0.086916 | -0.089660 | 0.186299 | 0.215864 | 0.274992 | -0.086750 | 0.248179 | 0.320986 | 0.162499 | -0.017418 | NaN | -0.019683 | -0.012510 | -0.041839 | -0.025968 | -0.024379 | 0.171170 | 0.028434 | NaN | -0.016969 | -0.027832 | -0.016781 | -0.026202 | -0.018373 | 0.036544 | -0.021419 | NaN | 0.013228 | 0.235104 | -0.017908 | -0.017290 | -0.019883 | -0.005678 | -0.004555 | -0.064186 | -0.017051 | -0.023564 | 0.081941 | -0.008636 | -0.019566 | -0.024759 | -0.044066 | -0.003520 | -0.014390 | NaN | -0.015667 | 0.535282 | -0.093645 | -0.023403 | -0.027769 | -0.021246 | -0.010847 | -0.021787 | -0.012459 | -0.015026 | -0.006477 | -0.003714 | -0.017900 | -0.019085 | -0.016092 | 0.015407 | 0.047880 | NaN | 0.007103 | -0.166755 | -0.025539 | -0.019751 | -0.006792 | -0.025352 | NaN | -0.026261 | -0.009927 | -0.031803 | -0.009164 | -0.022177 | -0.015801 | -0.024034 | -0.005980 | -0.014690 | -0.028373 | -0.013137 | -0.012258 | -0.019147 | -0.009822 | -0.024442 | -0.021150 | -0.031599 | -0.007015 | -0.049317 | -0.048545 | 0.067402 | 1.000000 | NaN | NaN | NaN |
| Po | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| At | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Rn | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
168 rows × 168 columns
# Calculate the correlations
correlations = merged_df_final.corr()
# List of all features except the target
all_features = merged_df_final.columns.tolist()
target_variable = 'critical_temp_x'
all_features.remove(target_variable)
# Define the number of features per group
features_per_group = 4
total_features = len(all_features)
total_groups = (total_features + features_per_group - 1) // features_per_group
# Iterate through groups
for group_num in range(total_groups):
    start_idx = group_num * features_per_group
    end_idx = min((group_num + 1) * features_per_group, total_features)
    selected_features = all_features[start_idx:end_idx]
    # Create a new figure for each group
    plt.figure(figsize=(15, 10))
    for idx, feature in enumerate(selected_features):
        plt.subplot(2, 4, idx + 1)
        sns.scatterplot(data=merged_df_final, x=feature, y=target_variable)
        # Calculate the correlation coefficient
        corr_coeff = correlations.loc[feature, target_variable]
        # Annotate with the correlation coefficient
        plt.text(0.5, 0.9, f'Corr: {corr_coeff: .2f}', horizontalalignment='center',
                 verticalalignment='center', transform=plt.gca().transAxes, fontsize=10)
        plt.title(f"Scatter Plot: {target_variable} vs {feature}", fontsize=9)  # Adjust title size here
    plt.tight_layout()
    plt.subplots_adjust(top=0.9)  # Adjust top spacing for the overall title
    plt.suptitle(f"Correlation Plots - Group {group_num+1}", fontsize=16)  # Overall title
    plt.show()
The correlation plots give insight into which features might have a relationship with our response variable, critical temperature. As we previously explored, there is a positive correlation between the number of elements and the critical temperature. There is a strong positive correlation of 0.72 between critical temperature and the weighted standard deviation of Thermal Conductivity, indicating that critical temperature tends to rise as the weighted standard deviation of Thermal Conductivity rises. Of the elements featured in the study, Ba (Barium) and O (Oxygen) have moderate correlations with critical temperature of 0.56 and 0.57 respectively. There is a moderate negative correlation of -0.62 between critical temperature and the weighted geometric mean of valence.
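Reading the target correlations off individual scatter plots can be slow with this many features. The sketch below shows one way to rank features by the absolute value of their correlation with the target. The helper name `rank_by_target_correlation` and the toy DataFrame are hypothetical stand-ins for `merged_df_final`, not part of the original analysis.

```python
import numpy as np
import pandas as pd

def rank_by_target_correlation(df, target, top_n=5):
    """Return the top_n features ordered by |Pearson r| with the target."""
    corr = df.corr()[target].drop(target)  # correlation of every feature with the target
    corr = corr.dropna()                   # constant columns have undefined (NaN) correlation
    order = corr.abs().sort_values(ascending=False).index
    return corr.loc[order].head(top_n)

# Toy data standing in for merged_df_final (hypothetical values)
rng = np.random.default_rng(0)
x = rng.normal(size=200)
toy = pd.DataFrame({
    "strong_pos": x,
    "strong_neg": -x + rng.normal(scale=0.5, size=200),
    "noise": rng.normal(size=200),
    "constant": np.zeros(200),
    "critical_temp_x": 3 * x + rng.normal(scale=0.5, size=200),
})

ranked = rank_by_target_correlation(toy, "critical_temp_x")
print(ranked)
```

Dropping NaN values mirrors what the full correlation matrix shows for elements such as Po, At and Rn, whose columns have no defined correlation.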
correlations = merged_df_final.corr()
# Set the correlation threshold
threshold = 0.90
# Create empty lists to store strong relationships
strong_pos_corr = []
strong_neg_corr = []
# Iterate through the correlation matrix
# Note: each pair is recorded twice, once per ordering of the two features
for feature1 in correlations.columns:
    for feature2 in correlations.index:
        if feature1 != feature2:  # Avoid comparing a feature with itself
            corr_value = correlations.loc[feature2, feature1]
            if corr_value > threshold:
                strong_pos_corr.append((feature2, feature1, corr_value))
            elif corr_value < -threshold:
                strong_neg_corr.append((feature2, feature1, corr_value))
# Sort the correlations alphabetically by feature name
strong_pos_corr.sort()
strong_neg_corr.sort()
# Format results as tables
pos_table = tabulate(strong_pos_corr, headers=["Feature 1", "Feature 2", "Correlation"], tablefmt="grid")
neg_table = tabulate(strong_neg_corr, headers=["Feature 1", "Feature 2", "Correlation"], tablefmt="grid")
# Display strong positive correlations
print("Strong Positive Correlations:")
print(pos_table)
print()
# Display strong negative correlations
print("Strong Negative Correlations:")
print(neg_table)
Strong Positive Correlations:
| Feature 1 | Feature 2 | Correlation |
| --- | --- | --- |
| entropy_Density | entropy_FusionHeat | 0.917732 |
| entropy_Density | entropy_Valence | 0.900579 |
| entropy_Density | entropy_atomic_mass | 0.932668 |
| entropy_Density | entropy_atomic_radius | 0.91555 |
| entropy_Density | entropy_fie | 0.902037 |
| entropy_ElectronAffinity | entropy_Valence | 0.904659 |
| entropy_ElectronAffinity | entropy_atomic_radius | 0.909744 |
| entropy_ElectronAffinity | entropy_fie | 0.912862 |
| entropy_FusionHeat | entropy_Density | 0.917732 |
| entropy_FusionHeat | entropy_Valence | 0.921445 |
| entropy_FusionHeat | entropy_atomic_mass | 0.928251 |
| entropy_FusionHeat | entropy_atomic_radius | 0.930294 |
| entropy_FusionHeat | entropy_fie | 0.916592 |
| entropy_FusionHeat | number_of_elements | 0.900759 |
| entropy_Valence | entropy_Density | 0.900579 |
| entropy_Valence | entropy_ElectronAffinity | 0.904659 |
| entropy_Valence | entropy_FusionHeat | 0.921445 |
| entropy_Valence | entropy_atomic_mass | 0.963621 |
| entropy_Valence | entropy_atomic_radius | 0.989546 |
| entropy_Valence | entropy_fie | 0.992726 |
| entropy_Valence | number_of_elements | 0.967832 |
| entropy_Valence | wtd_entropy_Valence | 0.910822 |
| entropy_Valence | wtd_entropy_atomic_radius | 0.919184 |
| entropy_atomic_mass | entropy_Density | 0.932668 |
| entropy_atomic_mass | entropy_FusionHeat | 0.928251 |
| entropy_atomic_mass | entropy_Valence | 0.963621 |
| entropy_atomic_mass | entropy_atomic_radius | 0.972329 |
| entropy_atomic_mass | entropy_fie | 0.964695 |
| entropy_atomic_mass | number_of_elements | 0.939304 |
| entropy_atomic_radius | entropy_Density | 0.91555 |
| entropy_atomic_radius | entropy_ElectronAffinity | 0.909744 |
| entropy_atomic_radius | entropy_FusionHeat | 0.930294 |
| entropy_atomic_radius | entropy_Valence | 0.989546 |
| entropy_atomic_radius | entropy_atomic_mass | 0.972329 |
| entropy_atomic_radius | entropy_fie | 0.997739 |
| entropy_atomic_radius | number_of_elements | 0.972245 |
| entropy_atomic_radius | wtd_entropy_atomic_radius | 0.914223 |
| entropy_fie | entropy_Density | 0.902037 |
| entropy_fie | entropy_ElectronAffinity | 0.912862 |
| entropy_fie | entropy_FusionHeat | 0.916592 |
| entropy_fie | entropy_Valence | 0.992726 |
| entropy_fie | entropy_atomic_mass | 0.964695 |
| entropy_fie | entropy_atomic_radius | 0.997739 |
| entropy_fie | number_of_elements | 0.973195 |
| entropy_fie | wtd_entropy_Valence | 0.907923 |
| entropy_fie | wtd_entropy_atomic_radius | 0.920192 |
| gmean_Density | wtd_gmean_Density | 0.951995 |
| gmean_FusionHeat | mean_FusionHeat | 0.926769 |
| gmean_Valence | mean_Valence | 0.989911 |
| gmean_Valence | wtd_gmean_Valence | 0.933036 |
| gmean_Valence | wtd_mean_Valence | 0.917905 |
| gmean_atomic_mass | mean_atomic_mass | 0.940298 |
| gmean_atomic_radius | mean_atomic_radius | 0.915931 |
| gmean_fie | mean_fie | 0.969325 |
| mean_FusionHeat | gmean_FusionHeat | 0.926769 |
| mean_FusionHeat | wtd_mean_FusionHeat | 0.909575 |
| mean_Valence | gmean_Valence | 0.989911 |
| mean_Valence | wtd_gmean_Valence | 0.940001 |
| mean_Valence | wtd_mean_Valence | 0.937103 |
| mean_atomic_mass | gmean_atomic_mass | 0.940298 |
| mean_atomic_radius | gmean_atomic_radius | 0.915931 |
| mean_fie | gmean_fie | 0.969325 |
| number_of_elements | entropy_FusionHeat | 0.900759 |
| number_of_elements | entropy_Valence | 0.967832 |
| number_of_elements | entropy_atomic_mass | 0.939304 |
| number_of_elements | entropy_atomic_radius | 0.972245 |
| number_of_elements | entropy_fie | 0.973195 |
| number_of_elements | wtd_entropy_atomic_radius | 0.904121 |
| range_Density | std_Density | 0.959956 |
| range_Density | wtd_std_Density | 0.907307 |
| range_ElectronAffinity | std_ElectronAffinity | 0.973114 |
| range_FusionHeat | std_FusionHeat | 0.984574 |
| range_FusionHeat | wtd_std_FusionHeat | 0.925642 |
| range_ThermalConductivity | std_ThermalConductivity | 0.987867 |
| range_ThermalConductivity | wtd_std_ThermalConductivity | 0.965449 |
| range_Valence | std_Valence | 0.973788 |
| range_atomic_mass | std_atomic_mass | 0.960854 |
| range_atomic_mass | wtd_std_atomic_mass | 0.918152 |
| range_atomic_radius | range_fie | 0.908734 |
| range_atomic_radius | std_atomic_radius | 0.967428 |
| range_atomic_radius | wtd_std_atomic_radius | 0.958004 |
| range_fie | range_atomic_radius | 0.908734 |
| range_fie | std_fie | 0.981628 |
| range_fie | wtd_std_fie | 0.940281 |
| std_Density | range_Density | 0.959956 |
| std_Density | wtd_std_Density | 0.905669 |
| std_ElectronAffinity | range_ElectronAffinity | 0.973114 |
| std_FusionHeat | range_FusionHeat | 0.984574 |
| std_FusionHeat | wtd_std_FusionHeat | 0.940183 |
| std_ThermalConductivity | range_ThermalConductivity | 0.987867 |
| std_ThermalConductivity | wtd_std_ThermalConductivity | 0.955627 |
| std_Valence | range_Valence | 0.973788 |
| std_atomic_mass | range_atomic_mass | 0.960854 |
| std_atomic_mass | wtd_std_atomic_mass | 0.919788 |
| std_atomic_radius | range_atomic_radius | 0.967428 |
| std_atomic_radius | wtd_std_atomic_radius | 0.944536 |
| std_fie | range_fie | 0.981628 |
| std_fie | wtd_std_fie | 0.934255 |
| wtd_entropy_FusionHeat | wtd_entropy_Valence | 0.908728 |
| wtd_entropy_FusionHeat | wtd_entropy_atomic_radius | 0.90786 |
| wtd_entropy_Valence | entropy_Valence | 0.910822 |
| wtd_entropy_Valence | entropy_fie | 0.907923 |
| wtd_entropy_Valence | wtd_entropy_FusionHeat | 0.908728 |
| wtd_entropy_Valence | wtd_entropy_atomic_mass | 0.918284 |
| wtd_entropy_Valence | wtd_entropy_atomic_radius | 0.951463 |
| wtd_entropy_atomic_mass | wtd_entropy_Valence | 0.918284 |
| wtd_entropy_atomic_mass | wtd_entropy_atomic_radius | 0.961464 |
| wtd_entropy_atomic_radius | entropy_Valence | 0.919184 |
| wtd_entropy_atomic_radius | entropy_atomic_radius | 0.914223 |
| wtd_entropy_atomic_radius | entropy_fie | 0.920192 |
| wtd_entropy_atomic_radius | number_of_elements | 0.904121 |
| wtd_entropy_atomic_radius | wtd_entropy_FusionHeat | 0.90786 |
| wtd_entropy_atomic_radius | wtd_entropy_Valence | 0.951463 |
| wtd_entropy_atomic_radius | wtd_entropy_atomic_mass | 0.961464 |
| wtd_gmean_Density | gmean_Density | 0.951995 |
| wtd_gmean_Density | wtd_mean_Density | 0.941502 |
| wtd_gmean_FusionHeat | wtd_mean_FusionHeat | 0.970948 |
| wtd_gmean_Valence | gmean_Valence | 0.933036 |
| wtd_gmean_Valence | mean_Valence | 0.940001 |
| wtd_gmean_Valence | wtd_mean_Valence | 0.994939 |
| wtd_gmean_atomic_mass | wtd_mean_atomic_mass | 0.964085 |
| wtd_gmean_atomic_radius | wtd_mean_atomic_radius | 0.980107 |
| wtd_gmean_fie | wtd_mean_fie | 0.992331 |
| wtd_mean_Density | wtd_gmean_Density | 0.941502 |
| wtd_mean_FusionHeat | mean_FusionHeat | 0.909575 |
| wtd_mean_FusionHeat | wtd_gmean_FusionHeat | 0.970948 |
| wtd_mean_Valence | gmean_Valence | 0.917905 |
| wtd_mean_Valence | mean_Valence | 0.937103 |
| wtd_mean_Valence | wtd_gmean_Valence | 0.994939 |
| wtd_mean_atomic_mass | wtd_gmean_atomic_mass | 0.964085 |
| wtd_mean_atomic_radius | wtd_gmean_atomic_radius | 0.980107 |
| wtd_mean_fie | wtd_gmean_fie | 0.992331 |
| wtd_std_Density | range_Density | 0.907307 |
| wtd_std_Density | std_Density | 0.905669 |
| wtd_std_FusionHeat | range_FusionHeat | 0.925642 |
| wtd_std_FusionHeat | std_FusionHeat | 0.940183 |
| wtd_std_ThermalConductivity | range_ThermalConductivity | 0.965449 |
| wtd_std_ThermalConductivity | std_ThermalConductivity | 0.955627 |
| wtd_std_atomic_mass | range_atomic_mass | 0.918152 |
| wtd_std_atomic_mass | std_atomic_mass | 0.919788 |
| wtd_std_atomic_radius | range_atomic_radius | 0.958004 |
| wtd_std_atomic_radius | std_atomic_radius | 0.944536 |
| wtd_std_atomic_radius | wtd_std_fie | 0.922258 |
| wtd_std_fie | range_fie | 0.940281 |
| wtd_std_fie | std_fie | 0.934255 |
| wtd_std_fie | wtd_std_atomic_radius | 0.922258 |

Strong Negative Correlations:
| Feature 1 | Feature 2 | Correlation |
| --- | --- | --- |
| wtd_gmean_atomic_radius | wtd_mean_fie | -0.914255 |
| wtd_mean_fie | wtd_gmean_atomic_radius | -0.914255 |
From the correlation table, we can see that many of the entropy features in the dataset are very strongly positively correlated with one another. Among the entropy features, the strongest correlation is between the entropy of atomic radius and the entropy of FIE at 0.998, followed by the entropy of Valence and the entropy of FIE at 0.993. The only very strong negative correlation that we observe is between the weighted geometric mean of atomic radius and the weighted mean of FIE, at -0.914; it appears twice in the table because each pair is listed in both orders. It makes logical sense that features derived from the same property, or computed with the same summary statistic, are strongly correlated with one another.
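Because a correlation matrix is symmetric, a nested loop over all rows and columns reports every pair twice. A minimal sketch of an alternative that keeps only the upper triangle, so each pair appears once, is given here. The function name `strong_pairs` and the toy columns `a` to `d` are hypothetical and are used only for illustration.

```python
import numpy as np
import pandas as pd

def strong_pairs(df, threshold=0.90):
    """List each feature pair with |r| > threshold exactly once."""
    corr = df.corr()
    # Keep only the strict upper triangle so (a, b) and (b, a) are not both reported
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack()
    # Masked cells are NaN and therefore never pass the threshold test
    pairs = pairs[pairs.abs() > threshold]
    return pairs.sort_values(ascending=False)

# Toy data: a and b move together, c and d move in opposite directions
rng = np.random.default_rng(0)
a = rng.normal(size=300)
c = rng.normal(size=300)
toy = pd.DataFrame({
    "a": a,
    "b": a + rng.normal(scale=0.1, size=300),
    "c": c,
    "d": -c + rng.normal(scale=0.1, size=300),
})

result = strong_pairs(toy)
print(result)
```

The result is a Series indexed by feature pairs, which can be passed to `tabulate` after `reset_index()` if the grid layout is preferred.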
from sklearn.metrics import mean_squared_error
# Split the dataset into features (x) and target (y)
data_x = merged_df_final.drop(columns='critical_temp_x')
data_y = merged_df_final[['critical_temp_x']]
# Normalize the target to the [0, 1] range
scaler = MinMaxScaler()
norm_y = scaler.fit_transform(data_y)
# Split the data into train and test datasets
x_train, x_test, y_train, y_test = train_test_split(data_x, norm_y, test_size=0.2, random_state=10)
# Train the model
lasso_model = Lasso(alpha=0.01, max_iter=100000)
lasso_model.fit(x_train, y_train)
# Predict on the test set
y_pred = lasso_model.predict(x_test)
# Inverse transform the normalized targets and predictions back to the original scale
y_test_actual = scaler.inverse_transform(y_test.reshape(-1, 1))
y_pred_actual = scaler.inverse_transform(y_pred.reshape(-1, 1))
# Find the MSE of the test set on the original scale
mse = mean_squared_error(y_test_actual, y_pred_actual)
print(f"The LASSO MSE is {mse}")
# Create a DataFrame for plotting
plot_data = pd.DataFrame({'Actual': y_test_actual.flatten(), 'Predicted': y_pred_actual.flatten()})
# Create a scatter plot
fig = px.scatter(plot_data, x='Actual', y='Predicted', title='Lasso Model Performance')
# Customize the plot
fig.update_layout(
    xaxis_title='Actual Critical Temperature',
    yaxis_title='Predicted Critical Temperature',
    showlegend=True,
    legend_title='Data Points',
    width=600,
    height=500
)
# Show the plot
fig.show()
The LASSO MSE is 336.8236368162868
# Get the coefficients of the Lasso model
lasso_coeffs = lasso_model.coef_
# Create a DataFrame to display the coefficients along with the corresponding feature names
coefficients_df = pd.DataFrame({'Feature': data_x.columns, 'Coefficient': lasso_coeffs})
# Sort the coefficients by absolute value to visualize importance
coefficients_df['Abs_Coefficient'] = abs(coefficients_df['Coefficient'])
coefficients_df = coefficients_df.sort_values(by='Abs_Coefficient', ascending=False)
# Display the top N important features (e.g., top 10)
top_n = 10
top_features = coefficients_df.head(top_n)
# Print the top 10 important features
print(f"Top {top_n} Important Features:")
print(top_features)
# Create a bar plot to visualize the top N important features
import plotly.graph_objects as go
fig_coeffs = go.Figure()
fig_coeffs.add_trace(go.Bar(x=top_features['Feature'], y=top_features['Abs_Coefficient']))
fig_coeffs.update_layout(
    title=f'Top {top_n} Important Features - Lasso Model',
    xaxis_title='Feature',
    yaxis_title='Absolute Coefficient',
    width=800,
    height=400
)
# Show the bar plot for coefficients
fig_coeffs.show()
Top 10 Important Features:
Feature Coefficient Abs_Coefficient
136 Ba 0.014870 0.014870
49 std_ElectronAffinity 0.004856 0.004856
44 wtd_gmean_ElectronAffinity -0.002778 0.002778
62 wtd_mean_ThermalConductivity 0.002662 0.002662
88 O 0.002382 0.002382
42 wtd_mean_ElectronAffinity 0.002311 0.002311
50 wtd_std_ElectronAffinity -0.001944 0.001944
64 wtd_gmean_ThermalConductivity -0.001941 0.001941
47 range_ElectronAffinity -0.001800 0.001800
10 wtd_std_atomic_mass -0.001728 0.001728
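In the code shown here only the target is rescaled before fitting. If the features are still in their original units, the size of each coefficient reflects the feature's scale as well as its influence, so ranking by absolute coefficient can be misleading. The sketch below, on synthetic data with hypothetical feature names, illustrates the effect and one common remedy: standardizing the features inside a pipeline. It is an illustration under those assumptions, not the analysis performed above.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n = 500
# Two hypothetical features with equal influence on y but very different units
small_units = rng.normal(scale=1.0, size=n)
large_units = rng.normal(scale=1000.0, size=n)
X = pd.DataFrame({"small_units": small_units, "large_units": large_units})
y = 2.0 * small_units + 0.002 * large_units + rng.normal(scale=0.1, size=n)

# Lasso on raw features: coefficients depend on each feature's units
raw = Lasso(alpha=0.01, max_iter=100000).fit(X, y)

# Lasso on standardized features: coefficients are per standard deviation
scaled = make_pipeline(StandardScaler(), Lasso(alpha=0.01, max_iter=100000)).fit(X, y)

raw_coefs = pd.Series(raw.coef_, index=X.columns)
scaled_coefs = pd.Series(scaled.named_steps["lasso"].coef_, index=X.columns)
print("Raw coefficients:")
print(raw_coefs)
print("Standardized coefficients:")
print(scaled_coefs)
```

On the raw features the two coefficients differ by roughly three orders of magnitude even though both features contribute equally to `y`; after standardizing they are of similar size.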
# Train the Ridge model
ridge_model = Ridge(alpha=0.01) # You can adjust the alpha parameter
ridge_model.fit(x_train, y_train)
# Test the Ridge model
y_pred_ridge = ridge_model.predict(x_test)
# Inverse transform the normalized Ridge predictions
y_pred_ridge_actual = scaler.inverse_transform(y_pred_ridge.reshape(-1, 1))
# Find the MSE of the test
mse_ridge = mean_squared_error(y_test_actual, y_pred_ridge_actual)
print(f"The Ridge MSE is {mse_ridge}")
# Create a DataFrame for plotting
plot_data_ridge = pd.DataFrame({'Actual': y_test_actual.flatten(), 'Predicted': y_pred_ridge_actual.flatten()})
# Create a scatter plot for Ridge model performance
fig_ridge = px.scatter(plot_data_ridge, x='Actual', y='Predicted', title='Ridge Model Performance')
# Customize the Ridge plot
fig_ridge.update_layout(
xaxis_title='Actual Critical Temperature',
yaxis_title='Predicted Critical Temperature',
showlegend=True,
legend_title='Data Points',
width=600,
height=500
)
# Show the Ridge plot
fig_ridge.show()
The Ridge MSE is 289.35813124754634
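MSE is expressed in squared temperature units, so taking the square root gives an error on the same scale as the critical temperature itself. The sketch below only re-uses the two test-set MSE values printed in this notebook; the variable names are chosen here for illustration and do not overwrite `mse_ridge`.

```python
import math

# Test-set MSE values as printed by the LASSO and Ridge cells
mse_lasso_reported = 336.8236368162868
mse_ridge_reported = 289.35813124754634

# RMSE puts the error back on the scale of the target
rmse_lasso = math.sqrt(mse_lasso_reported)
rmse_ridge = math.sqrt(mse_ridge_reported)

# Fraction by which the Ridge MSE is lower than the LASSO MSE
relative_drop = (mse_lasso_reported - mse_ridge_reported) / mse_lasso_reported

print(f"LASSO RMSE: {rmse_lasso:.2f}")
print(f"Ridge RMSE: {rmse_ridge:.2f}")
print(f"Ridge MSE is {relative_drop:.1%} lower than the LASSO MSE")
```

On these numbers the typical prediction error is roughly 18.4 for LASSO and 17.0 for Ridge, with the Ridge MSE about 14% lower. This is a comparison on a single train/test split, so it should be read as indicative rather than conclusive.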
# Reuse the Ridge model fitted above (alpha=0.01); refitting on the same data would give the same coefficients
# Get the coefficients of the Ridge model
# coef_ has shape (1, n_features) when the target is 2-D, so flatten it to one value per feature
ridge_coeffs = np.ravel(ridge_model.coef_)
# Create a Series to display the coefficients along with the corresponding feature names
coefficients_ridge_series = pd.Series(ridge_coeffs, index=data_x.columns)
# Sort the coefficients by absolute value to visualize importance
coefficients_ridge_series_abs = coefficients_ridge_series.abs().sort_values(ascending=False)
# Display the top N important features (e.g., top 10)
top_n_ridge = 10
top_features_ridge = coefficients_ridge_series_abs.head(top_n_ridge)
# Print the top N important features for Ridge
print(f"Top {top_n_ridge} Important Features (Ridge):")
print(top_features_ridge)
# Create a bar plot to visualize the top N important features for Ridge
fig_coeffs_ridge = go.Figure()
fig_coeffs_ridge.add_trace(go.Bar(x=top_features_ridge.index, y=top_features_ridge.values))
fig_coeffs_ridge.update_layout(
title=f'Top {top_n_ridge} Important Features - Ridge Model',
xaxis_title='Feature',
yaxis_title='Absolute Coefficient',
width=800,
height=400
)
# Show the bar plot for coefficients for Ridge
fig_coeffs_ridge.show()
Top 10 Important Features (Ridge):
entropy_Valence                 0.373793
wtd_entropy_Valence             0.341844
wtd_entropy_fie                 0.267281
wtd_entropy_FusionHeat          0.138216
entropy_fie                     0.119500
entropy_FusionHeat              0.114483
entropy_atomic_mass             0.114102
wtd_entropy_ElectronAffinity    0.106014
entropy_atomic_radius           0.105997
wtd_std_Valence                 0.093391
dtype: float64
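Since the goal is interpretability, it is worth checking whether the two models agree on which features matter. A small sketch that compares the two top-10 lists exactly as printed in this notebook (the set names are chosen here for illustration):

```python
# Top-10 feature names as printed for the Lasso model
lasso_top10 = {
    "Ba", "std_ElectronAffinity", "wtd_gmean_ElectronAffinity",
    "wtd_mean_ThermalConductivity", "O", "wtd_mean_ElectronAffinity",
    "wtd_std_ElectronAffinity", "wtd_gmean_ThermalConductivity",
    "range_ElectronAffinity", "wtd_std_atomic_mass",
}

# Top-10 feature names as printed for the Ridge model
ridge_top10 = {
    "entropy_Valence", "wtd_entropy_Valence", "wtd_entropy_fie",
    "wtd_entropy_FusionHeat", "entropy_fie", "entropy_FusionHeat",
    "entropy_atomic_mass", "wtd_entropy_ElectronAffinity",
    "entropy_atomic_radius", "wtd_std_Valence",
}

# Features that both models rank in their top 10
shared = lasso_top10 & ridge_top10
print(f"Features in both top-10 lists: {sorted(shared)}")
```

The two lists have no feature in common, so the rankings should be interpreted with care. Coefficient magnitudes are only comparable across features when the inputs are on a common scale, and this section does not show whether the features were standardized before fitting.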